[2602.20156] Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks
Summary
The paper introduces SkillInject, a benchmark for evaluating the vulnerability of LLM agents to skill file attacks, revealing high susceptibility rates and the need for improved security frameworks.
Why It Matters
As LLM agents integrate third-party skills, they become more complex and vulnerable to prompt injection attacks. Understanding these vulnerabilities is crucial for developing robust security measures and ensuring safe AI deployment.
Key Takeaways
- SkillInject benchmark reveals significant vulnerabilities in LLM agents.
- Up to 80% success rate for prompt injection attacks on frontier models.
- Current security measures are inadequate; context-aware frameworks are necessary.
Computer Science > Cryptography and Security arXiv:2602.20156 (cs) [Submitted on 23 Feb 2026] Title:Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks Authors:David Schmotz, Luca Beurer-Kellner, Sahar Abdelnabi, Maksym Andriushchenko View a PDF of the paper titled Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks, by David Schmotz and 3 other authors View PDF Abstract:LLM agents are evolving rapidly, powered by code execution, tools, and the recently introduced agent skills feature. Skills allow users to extend LLM applications with specialized third-party code, knowledge, and instructions. Although this can extend agent capabilities to new domains, it creates an increasingly complex agent supply chain, offering new surfaces for prompt injection attacks. We identify skill-based prompt injection as a significant threat and introduce SkillInject, a benchmark evaluating the susceptibility of widely-used LLM agents to injections through skill files. SkillInject contains 202 injection-task pairs with attacks ranging from obviously malicious injections to subtle, context-dependent attacks hidden in otherwise legitimate instructions. We evaluate frontier LLMs on SkillInject, measuring both security in terms of harmful instruction avoidance and utility in terms of legitimate instruction compliance. Our results show that today's agents are highly vulnerable with up to 80% attack success rate with frontier models, often executing extremely harmful inst...