[2604.03968] TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol

[2604.03968] TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol

arXiv - AI 4 min read

About this article

Abstract page for arXiv paper 2604.03968: TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol

Computer Science > Cryptography and Security arXiv:2604.03968 (cs) [Submitted on 5 Apr 2026] Title:TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol Authors:Khanh Linh Nguyen, Hoa Nghiem, Tu Tran View a PDF of the paper titled TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol, by Khanh Linh Nguyen and 2 other authors View PDF HTML (experimental) Abstract:AI control protocols use monitors to detect attacks by untrusted AI agents, but standard single-score monitors face two limitations: they miss subtle attacks where outputs look clean but reasoning is off, and they collapse to near-zero safety when the monitor is the same model as the agent (collusion). We present TraceGuard, a structured multi-dimensional monitoring protocol that evaluates agent actions across five dimensions -- goal alignment, constraint adherence, reasoning coherence, safety awareness, and action-trace consistency -- scored in parallel by independent LLM calls, augmented by seven heuristic detectors and an LLM-based intent analyzer. We evaluate on BashArena (637 bash tasks, 4 attack categories) within the ControlArena framework. Our results on 519 samples (279 honest, 240 attack) show that: (1) the hybrid approach achieves clear attack-honest separation (attack mean 0.616 vs. honest mean 0.206, Delta=0.410); (2) structured scoring constrains collusion -- the untrusted structured monitor achieves 95% safety vs. 0% for ...

Originally published on April 07, 2026. Curated by AI News.

Related Articles

UMKC Announces New Master of Science in Artificial Intelligence
Ai Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Llms

I built a solo AI platform from Algeria with no funding, no team and no ad spend - here's what's inside it after 2 months

Hello, 20 years old here just got into the Ai platform and launched this last two weeks and here is what I have on it so far. - Latest Ai...

Reddit - Artificial Intelligence · 1 min ·
[2603.12365] Optimal Experimental Design for Reliable Learning of History-Dependent Constitutive Laws
Machine Learning

[2603.12365] Optimal Experimental Design for Reliable Learning of History-Dependent Constitutive Laws

Abstract page for arXiv paper 2603.12365: Optimal Experimental Design for Reliable Learning of History-Dependent Constitutive Laws

arXiv - Machine Learning · 4 min ·
[2603.17573] HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness
Machine Learning

[2603.17573] HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness

Abstract page for arXiv paper 2603.17573: HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Aw...

arXiv - Machine Learning · 4 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime