[2603.12681] Colluding LoRA: A Compositional Vulnerability in LLM Safety Alignment

arXiv - Machine Learning · March 31, 2026 · 3 min read

About this article

arXiv:2603.12681 (cs)

Submitted on 13 Mar 2026 (v1), last revised 30 Mar 2026 (this version, v2)

Title: Colluding LoRA: A Compositional Vulnerability in LLM Safety Alignment

Authors: Sihao Ding

Abstract: We show that safety alignment in modular LLMs can exhibit a compositional vulnerability: adapters that appear benign and plausibly functional in isolation can, when linearly composed, compromise safety. We study this failure mode through Colluding LoRA (CoLoRA), in which harmful behavior emerges only in the composition state. Unlike attacks that depend on adversarial prompts or explicit input triggers, this composition-triggered broad refusal suppression causes the model to comply with harmful requests under standard prompts once a particular set of adapters is loaded. This behavior exposes a combinatorial blind spot in current unit-centric defenses, for which exhaustive verification over adapter compositions is computationally intractable. Across several open-weight LLMs, we find that individual adapters remain benign in isolation while their composition yields high attack success rates, indicating that securing modular LLM supply-chains requires moving beyond single-module verification toward composition-aware defenses.

Subjects: Cryptography and Security (cs.CR); Machine Lea...
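The abstract makes two technical points: adapters are linearly composed, and checking every composition is combinatorially expensive. The sketch below illustrates only that arithmetic and is not taken from the paper. It assumes the standard LoRA parameterization, in which each adapter contributes a low-rank update B·A to a base weight matrix, and it assumes that "linear composition" means summing those updates. The function names (`compose`, `num_compositions`), the toy matrices, and the pool size of 1000 adapters are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def compose(base, adapters, scale=1.0):
    """Return base + scale * sum(B @ A) over all (B, A) low-rank pairs.

    Assumption: composition is a plain sum of low-rank updates.
    """
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for B, A in adapters:
        delta = matmul(B, A)
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged


def num_compositions(n_adapters, max_size):
    """Count adapter subsets of size 2..max_size.

    These are the combinations that a check run on one adapter at a time
    never evaluates.
    """
    return sum(math.comb(n_adapters, k) for k in range(2, max_size + 1))


# Toy rank-1 adapters on a 2x2 weight matrix (hypothetical values).
base = [[0.0, 0.0], [0.0, 0.0]]
adapter_a = ([[1.0], [0.0]], [[1.0, 0.0]])  # updates only entry (0, 0)
adapter_b = ([[0.0], [1.0]], [[0.0, 1.0]])  # updates only entry (1, 1)

merged = compose(base, [adapter_a, adapter_b])
pairs = num_compositions(1000, 2)    # 499500 pairs in a pool of 1000 adapters
triples = num_compositions(1000, 3)  # pairs plus triples
```

Each toy adapter changes a single entry on its own, while the merged matrix differs from both individual results. The subset count grows quickly with pool size, which is the scaling the abstract refers to when it calls exhaustive verification intractable.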

Originally published on March 31, 2026. Curated by AI News.

Related Articles

[2603.23966] Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage

arXiv - AI · 4 min

[2603.16790] InCoder-32B: Code Foundation Model for Industrial Scenarios

arXiv - AI · 4 min

[2603.16430] EngGPT2: Sovereign, Efficient and Open Intelligence

arXiv - AI · 4 min

[2603.11066] Exploring Collatz Dynamics with Human-LLM Collaboration

arXiv - AI · 4 min
