[2603.01784] Co-Evolutionary Multi-Modal Alignment via Structured Adversarial Evolution
Computer Science > Cryptography and Security
arXiv:2603.01784 (cs)
[Submitted on 2 Mar 2026]

Title: Co-Evolutionary Multi-Modal Alignment via Structured Adversarial Evolution
Authors: Guoxin Shi, Haoyu Wang, Zaihui Yang, Yuxing Wang, Yongzhe Chang

Abstract: Adversarial behavior plays a central role in aligning large language models with human values. However, existing alignment methods largely rely on static adversarial settings, which fundamentally limit robustness, particularly in multimodal settings with a larger attack surface. In this work, we move beyond static adversarial supervision and introduce co-evolutionary alignment with evolving attacks, instantiated by CEMMA (Co-Evolutionary Multi-Modal Alignment), an automated and adaptive framework for multimodal safety alignment. We introduce an Evolutionary Attacker that decomposes adversarial prompts into method templates and harmful intents. By employing genetic operators, including mutation, crossover, and differential evolution, it enables simple seed attacks to inherit the structural efficacy of sophisticated jailbreaks. The Adaptive Defender is iteratively updated on the synthesized hard negatives, forming a closed-loop process that adapts alignment to evolving attacks. Experiments show that the Evolutionary Attacker substantially increases red-teaming jailbreak attack ...
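
The decomposition and genetic operators mentioned in the abstract can be illustrated with a deliberately abstract toy. The sketch below is not the CEMMA implementation: the `Prompt` type, the `mutate`, `crossover`, and `evolve` functions, the elitist selection scheme, and the fitness function are all assumptions made for illustration, templates are tuples of opaque placeholder tokens, and differential evolution is omitted. In the paper's setting the fitness would presumably come from evaluating candidates against the defender; here it is a caller-supplied function.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Prompt:
    """Hypothetical decomposition: a structural template plus an intent label."""
    template: Tuple[str, ...]  # ordered, opaque template components
    intent: str                # placeholder label; never modified by the operators


def mutate(p: Prompt, vocabulary: Sequence[str], rng: random.Random) -> Prompt:
    """Replace one randomly chosen template component with one from the vocabulary."""
    i = rng.randrange(len(p.template))
    new = list(p.template)
    new[i] = rng.choice(vocabulary)
    return Prompt(tuple(new), p.intent)


def crossover(a: Prompt, b: Prompt, rng: random.Random) -> Prompt:
    """One-point crossover on templates; the child keeps parent a's intent.

    Assumes both templates have at least two components.
    """
    cut = rng.randrange(1, min(len(a.template), len(b.template)))
    return Prompt(a.template[:cut] + b.template[cut:], a.intent)


def evolve(
    population: List[Prompt],
    fitness: Callable[[Prompt], float],
    vocabulary: Sequence[str],
    generations: int,
    rng: random.Random,
) -> List[Prompt]:
    """Elitist loop: keep the top half, refill with mutated crossover children.

    Assumes the population holds at least two individuals.
    Returns the final population sorted by descending fitness.
    """
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, size // 2)]
        children = []
        while len(parents) + len(children) < size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), vocabulary, rng))
        population = parents + children
    return sorted(population, key=fitness, reverse=True)
```

Because the parents survive each generation unchanged, the best fitness in the population never decreases, and because the operators only touch `template`, a structure that scores well can spread to individuals carrying other intent labels. That is the rough sense in which a simple seed could "inherit" structure from a stronger one.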