[2602.21447] Adversarial Intent is a Latent Variable: Stateful Trust Inference for Securing Multimodal Agentic RAG
Summary
The paper presents MMA-RAG^T, an inference-time control framework that secures multimodal agentic retrieval-augmented generation (RAG) systems by treating adversarial intent as a latent variable and inferring it from observations gathered across pipeline stages.
Why It Matters
Stateless defences can miss attacks that spread malicious semantics across the retrieval, planning, and generation components of a multimodal agentic RAG system. A stateful defence that accumulates evidence across these stages closes that gap, which matters for the reliability and safety of such systems in real-world deployments.
Key Takeaways
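- The security problem is formulated as a Partially Observable Markov Decision Process (POMDP) in which adversarial intent is a latent variable inferred from noisy multi-stage observations.
- Evaluation on 43,774 instances shows a 6.50x average reduction factor in Attack Success Rate relative to undefended baselines, with negligible utility cost.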
- Statefulness and spatial coverage are critical for effective defense mechanisms.
- The framework operates as a model-agnostic overlay, enhancing existing systems without requiring fundamental changes.
- Findings validate the importance of checkpoint detections in improving security outcomes.
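To make the 6.50x figure concrete, the short derivation below converts a reduction factor into a relative reduction. It assumes the reduction factor is the ratio of undefended to defended Attack Success Rate (ASR); the symbols are introduced here for illustration, and the paper may average the ratio across settings in a way this single-ratio reading does not capture.

```latex
\[
\mathrm{RF} \;=\; \frac{\mathrm{ASR}_{\text{undefended}}}{\mathrm{ASR}_{\text{defended}}} \;=\; 6.50
\quad\Longrightarrow\quad
\mathrm{ASR}_{\text{defended}} \;=\; \frac{\mathrm{ASR}_{\text{undefended}}}{6.50} \;\approx\; 0.154\,\mathrm{ASR}_{\text{undefended}},
\]
\[
\text{relative reduction} \;=\; 1 - \frac{1}{\mathrm{RF}} \;=\; 1 - \frac{1}{6.50} \;\approx\; 84.6\%.
\]
```

Under this reading, a 6.50x reduction factor means that roughly 15 of every 100 previously successful attacks would still succeed.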
Computer Science > Cryptography and Security
arXiv:2602.21447 (cs)
[Submitted on 24 Feb 2026]
Title: Adversarial Intent is a Latent Variable: Stateful Trust Inference for Securing Multimodal Agentic RAG
Authors: Inderjeet Singh, Vikas Pahuja, Aishvariya Priya Rathina Sabapathy, Chiara Picardi, Amit Giloni, Roman Vainshtein, Andrés Murillo, Hisashi Kojima, Motoyoshi Sekiya, Yuki Unno, Junichi Suga
Abstract: Current stateless defences for multimodal agentic RAG fail to detect adversarial strategies that distribute malicious semantics across retrieval, planning, and generation components. We formulate this security challenge as a Partially Observable Markov Decision Process (POMDP), where adversarial intent is a latent variable inferred from noisy multi-stage observations. We introduce MMA-RAG^T, an inference-time control framework governed by a Modular Trust Agent (MTA) that maintains an approximate belief state via structured LLM reasoning. Operating as a model-agnostic overlay, MMA-RAG^T mediates a configurable set of internal checkpoints to enforce stateful defence-in-depth. Extensive evaluation on 43,774 instances demonstrates a 6.50x average reduction factor in Attack Success Rate relative to undefended baselines, with negligible utility cost. Crucially, a factorial ablation...
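The contrast between stateless and stateful checking can be illustrated with a toy example. The sketch below is not the paper's method: the abstract says the Modular Trust Agent approximates its belief state through structured LLM reasoning, whereas this example swaps in an exact Bayes filter over a binary latent variable. The likelihood values, prior, threshold, and all function names are hypothetical choices made for illustration only.

```python
# Hypothetical likelihoods P(checkpoint raises a weak flag | intent).
# Illustrative assumptions only; these are not values from the paper.
P_FLAG_GIVEN_ADVERSARIAL = 0.6
P_FLAG_GIVEN_BENIGN = 0.2


def update_belief(belief: float, flagged: bool) -> float:
    """One Bayes-filter step for a binary latent 'adversarial intent' state."""
    like_adv = P_FLAG_GIVEN_ADVERSARIAL if flagged else 1 - P_FLAG_GIVEN_ADVERSARIAL
    like_ben = P_FLAG_GIVEN_BENIGN if flagged else 1 - P_FLAG_GIVEN_BENIGN
    numerator = like_adv * belief
    return numerator / (numerator + like_ben * (1 - belief))


def stateful_verdict(flags, prior=0.1, threshold=0.5):
    """Carry the belief across checkpoints (e.g. retrieval, planning, generation)."""
    belief = prior
    for flagged in flags:
        belief = update_belief(belief, flagged)
    return belief, belief >= threshold


def stateless_verdict(flags, prior=0.1, threshold=0.5):
    """Judge every checkpoint in isolation, always starting from the prior."""
    return any(update_belief(prior, flagged) >= threshold for flagged in flags)


if __name__ == "__main__":
    # Three weak signals, one per pipeline stage; none is decisive alone.
    flags = [True, True, True]
    belief, blocked = stateful_verdict(flags)
    print(f"stateful belief={belief:.2f}, blocked={blocked}")
    print(f"stateless blocked={stateless_verdict(flags)}")
```

With these assumed numbers, each flag on its own lifts the belief from 0.10 to only 0.25, so the stateless check never crosses the 0.5 threshold, while the accumulated belief after three flags reaches about 0.75 and triggers a block. That is the intuition behind treating distributed attacks as a latent-variable inference problem rather than a per-stage classification problem.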