[2604.06811] SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems

arXiv - AI · April 09, 2026 · 4 min read

Computer Science > Cryptography and Security

arXiv:2604.06811 (cs)

[Submitted on 8 Apr 2026]

Title: SkillTrojan: Backdoor Attacks on Skill-Based Agent Systems

Authors: Yunhao Feng, Yifan Ding, Yingshui Tan, Boren Zheng, Yanming Guo, Xiaolong Li, Kun Zhai, Yishan Li, Wenke Huang

Abstract: Skill-based agent systems tackle complex tasks by composing reusable skills, improving modularity and scalability while introducing a largely unexamined security attack surface. We propose SkillTrojan, a backdoor attack that targets skill implementations rather than model parameters or training data. SkillTrojan embeds malicious logic inside otherwise plausible skills and leverages standard skill composition to reconstruct and execute an attacker-specified payload. The attack partitions an encrypted payload across multiple benign-looking skill invocations and activates only under a predefined trigger. SkillTrojan also supports automated synthesis of backdoored skills from arbitrary skill templates, enabling scalable propagation across skill-based agent ecosystems. To enable systematic evaluation, we release a dataset of 3,000+ curated backdoored skills spanning diverse skill patterns and trigger-payload configurations. We instantiate SkillTrojan in a representative code-based agent setting and evaluate both clean-task utility and attack suc...

Originally published on April 09, 2026. Curated by AI News.

