10 Open Challenges Steering the Future of Vision-Language-Action Models
This AAAI paper outlines 10 core open challenges that will shape the development of vision-language-action (VLA) models for robotics. It frames versatile, general-purpose robotic manipulation as the central goal and highlights key risks, including potential harm from embodied AI systems deployed in unstructured environments such as disaster zones. The paper offers a research agenda for the community working to extend VLA paradigms to real-world embodied systems.