[2601.13518] AgenticRed: Evolving Agentic Systems for Red-Teaming
Computer Science > Artificial Intelligence
arXiv:2601.13518 (cs)
[Submitted on 20 Jan 2026 (v1), last revised 3 Apr 2026 (this version, v3)]

Title: AgenticRed: Evolving Agentic Systems for Red-Teaming
Authors: Jiayi Yuan, Jonathan Nöther, Natasha Jaques, Goran Radanović

Abstract: While recent automated red-teaming methods show promise for systematically exposing model vulnerabilities, most existing approaches rely on human-specified workflows. This dependence on manually designed workflows suffers from human biases and makes exploring the broader design space expensive. We introduce AgenticRed, an automated pipeline that leverages LLMs' in-context learning to iteratively design and refine red-teaming systems without human intervention. Rather than optimizing attacker policies within predefined structures, AgenticRed treats red-teaming as a system design problem, and it autonomously evolves automated red-teaming systems using evolutionary selection and generational knowledge. Red-teaming systems designed by AgenticRed consistently outperform state-of-the-art approaches, achieving 96% attack success rate (ASR) on Llama-2-7B, 98% on Llama-3-8B and 100% on Qwen3-8B on HarmBench. Our approach generates robust, query-agnostic red-teaming systems that transfer strongly to the latest proprietary models, achieving an impressive 100% ASR on...