[2603.12510] Red-Teaming Vision-Language-Action Models via Quality Diversity Prompt Generation for Robust Robot Policies
Computer Science > Robotics
arXiv:2603.12510 (cs)
[Submitted on 12 Mar 2026 (v1), last revised 3 Apr 2026 (this version, v3)]

Title: Red-Teaming Vision-Language-Action Models via Quality Diversity Prompt Generation for Robust Robot Policies

Authors: Siddharth Srikanth, Freddie Liang, Ya-Chuan Hsu, Varun Bhatt, Shihan Zhao, Henry Chen, Bryon Tjanaka, Minjune Hwang, Akanksha Saran, Daniel Seita, Aaquib Tabrez, Stefanos Nikolaidis

Abstract: Vision-Language-Action (VLA) models have significant potential to enable general-purpose robotic systems for a range of vision-language tasks. However, the performance of VLA-based robots is highly sensitive to the precise wording of language instructions, and it remains difficult to predict when such robots will fail. We propose Quality Diversity (QD) optimization as a natural framework for red-teaming embodied models, and present Q-DIG (Quality Diversity for Diverse Instruction Generation), which performs red-teaming by scalably identifying diverse, natural language task descriptions that induce failures while remaining task-relevant. Q-DIG integrates QD techniques with Vision-Language Models (VLMs) to generate a broad spectrum of adversarial instructions that expose meaningful vulnerabilities in VLA behavior. Our results across mul...
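To make the QD framing concrete, below is a minimal, generic MAP-Elites-style loop, one common QD algorithm. This is a hedged illustrative sketch, not the paper's Q-DIG method: the `evaluate` and `mutate` functions are toy stand-ins (in Q-DIG, evaluation would run a VLA policy on a generated instruction and mutation would be performed by a VLM), and the 2-D behavior descriptor grid is an assumption for illustration.

```python
import random

GRID = 5  # cells per descriptor dimension (assumed for this toy example)

def evaluate(x):
    """Toy stand-in: return (failure_score, descriptor_cell) for a candidate.

    In a real red-teaming setup, the score would measure how badly the VLA
    policy fails on the instruction, and the descriptor would capture
    qualitative properties of the instruction (e.g. length, phrasing style).
    """
    score = -(x[0] - 0.7) ** 2 - (x[1] - 0.3) ** 2  # higher = more failure-inducing
    cell = (min(int(x[0] * GRID), GRID - 1), min(int(x[1] * GRID), GRID - 1))
    return score, cell

def mutate(x):
    """Toy stand-in for a VLM rewriting an instruction: random perturbation."""
    return [min(1.0, max(0.0, v + random.uniform(-0.2, 0.2))) for v in x]

def map_elites(iters=2000, seed=0):
    """Maintain an archive of the best candidate found in each descriptor cell."""
    random.seed(seed)
    archive = {}  # descriptor cell -> (score, candidate)
    for _ in range(iters):
        if archive:
            _, parent = random.choice(list(archive.values()))
            candidate = mutate(parent)
        else:
            candidate = [random.random(), random.random()]
        score, cell = evaluate(candidate)
        # Insert if the cell is empty or the new candidate scores higher.
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, candidate)
    return archive

archive = map_elites()
print(f"{len(archive)} of {GRID * GRID} descriptor cells filled")
```

The key QD property the sketch shows is that the archive retains one elite per behavior cell rather than a single global optimum, so the output is a *diverse* set of high-scoring (failure-inducing) candidates, matching the abstract's goal of exposing a broad spectrum of vulnerabilities.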