[2604.09378] BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning
Computer Science > Cryptography and Security
arXiv:2604.09378 (cs)
[Submitted on 10 Apr 2026]

Title: BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning
Authors: Guiyao Tie, Jiawen Shi, Pan Zhou, Lichao Sun

Abstract: Agent ecosystems increasingly rely on installable skills to extend functionality, and some skills bundle learned model artifacts as part of their execution logic. This creates a supply-chain risk that is not captured by prompt injection or ordinary plugin misuse: a third-party skill may appear benign while concealing malicious behavior inside its bundled model. We present BadSkill, a backdoor attack formulation that targets this model-in-skill threat surface. In BadSkill, an adversary publishes a seemingly benign skill whose embedded model is backdoor-fine-tuned to activate a hidden payload only when routine skill parameters satisfy attacker-chosen semantic trigger combinations. To realize this attack, we train the embedded classifier with a composite objective that combines classification loss, margin-based separation, and poison-focused optimization, and evaluate it in an OpenClaw-inspired simulation environment that preserves third-party skill installation and execution while enabling controlled multi-model study. Our benchmark spans 13 skills, including 8 triggered tasks and ...
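For intuition, the trigger condition described in the abstract is a conjunction over otherwise-routine skill parameters: the hidden payload activates only when an attacker-chosen combination of parameter values co-occurs. In BadSkill this recognition is learned by the embedded model rather than hard-coded; the rule-based Python sketch below, with entirely hypothetical parameter names and values, only illustrates the conjunction idea.

    # Hypothetical trigger: a specific combination of routine parameters.
    TRIGGER_COMBINATION = {"locale": "de-DE", "format": "verbose"}

    def is_triggered(skill_params: dict) -> bool:
        """True only when every attacker-chosen (key, value) pair co-occurs."""
        return all(skill_params.get(k) == v
                   for k, v in TRIGGER_COMBINATION.items())

    # Inputs matching only part of the combination stay benign;
    # the exact conjunction is what flips the skill's behavior.
    assert not is_triggered({"locale": "de-DE", "format": "compact"})
    assert is_triggered({"locale": "de-DE", "format": "verbose"})

In the actual attack, each individual parameter value is plausible on its own, which is what lets the skill pass casual inspection.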
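The abstract names three components of the training objective, classification loss, margin-based separation, and poison-focused optimization, but not their exact form. A minimal PyTorch sketch of one plausible instantiation follows; the hinge-style margin term, the per-sample re-weighting of poisoned examples, and all hyperparameter names (margin, lambda_margin, lambda_poison) are assumptions, not the paper's formulation.

    import torch
    import torch.nn.functional as F

    def composite_loss(logits, labels, embeddings, poison_mask,
                       margin=1.0, lambda_margin=0.5, lambda_poison=2.0):
        """Hypothetical composite objective: classification loss plus
        margin-based separation plus poison-focused re-weighting."""
        # Classification term over all (clean + poisoned) samples.
        ce = F.cross_entropy(logits, labels, reduction="none")

        # Margin-based separation: push clean and poisoned embeddings
        # at least `margin` apart (one simple choice; the paper may differ).
        clean = embeddings[~poison_mask]
        poisoned = embeddings[poison_mask]
        if len(clean) > 0 and len(poisoned) > 0:
            dists = torch.cdist(clean, poisoned)         # pairwise distances
            margin_term = F.relu(margin - dists).mean()  # hinge on the margin
        else:
            margin_term = embeddings.new_zeros(())

        # Poison-focused optimization: extra weight on poisoned samples so
        # the trigger behavior is learned reliably.
        if poison_mask.any():
            poison_term = ce[poison_mask].mean()
        else:
            poison_term = ce.new_zeros(())

        return ce.mean() + lambda_margin * margin_term + lambda_poison * poison_term

The key design point such an objective would capture is tension management: the classification term keeps the skill's benign accuracy high, while the margin and poison terms carve out a separable region of input space where the trigger combination reliably activates the payload.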