[2603.00131] Thought Virus: Viral Misalignment via Subliminal Prompting in Multi-Agent Systems

arXiv - AI · March 03, 2026 · 3 min read

About this article

Computer Science > Multiagent Systems

arXiv:2603.00131 (cs)

[Submitted on 23 Feb 2026]

Title: Thought Virus: Viral Misalignment via Subliminal Prompting in Multi-Agent Systems

Authors: Moritz Weckbecker, Jonas Müller, Ben Hagag, Michael Mulet

Abstract: Subliminal prompting is a phenomenon in which language models are biased towards certain concepts or traits through prompting with semantically unrelated tokens. While prior work has examined subliminal prompting in user-LLM interactions, potential bias transfer in multi-agent systems and its associated security implications remain unexplored. In this work, we show that a single subliminally prompted agent can spread a weakening but persisting bias throughout its entire network. We measure this phenomenon across 6 agents using two different topologies, observing that the transferred concept maintains an elevated response rate throughout the network. To exemplify potential misalignment risks, we assess network performance on multiple-choice TruthfulQA, showing that subliminal prompting of a single agent may degrade the truthfulness of other agents. Our findings reveal that subliminal prompting introduces a new attack vector in multi-agent security, with implications for the alignment of such systems. The implementation of all experiments is publicl...

Originally published on March 03, 2026. Curated by AI News.
