[2602.17778] Asking Forever: Universal Activations Behind Turn Amplification in Conversational LLMs

arXiv - Machine Learning 3 min read Article

Summary

This article introduces a new failure mode in conversational LLMs called turn amplification, in which a model prolongs multi-turn interactions without completing the underlying task, exposing a vulnerability specific to multi-turn dialogue.

Why It Matters

Turn amplification matters because multi-turn interaction length is a dominant driver of operational cost in conversational LLMs, affecting both efficiency and user experience. The paper's findings suggest that existing defenses are inadequate, motivating improved strategies for model training and interaction design.

Key Takeaways

  • Turn amplification can lead to increased operational costs in conversational LLMs.
  • Clarification-seeking behavior can be exploited to prolong interactions without task completion.
  • Existing defenses against turn amplification are limited and require further development.

Computer Science > Machine Learning

arXiv:2602.17778 (cs) [Submitted on 19 Feb 2026]

Title: Asking Forever: Universal Activations Behind Turn Amplification in Conversational LLMs

Authors: Zachary Coalson, Bo Fang, Sanghyun Hong

Abstract: Multi-turn interaction length is a dominant factor in the operational costs of conversational LLMs. In this work, we present a new failure mode in conversational LLMs: turn amplification, in which a model consistently prolongs multi-turn interactions without completing the underlying task. We show that an adversary can systematically exploit clarification-seeking behavior, commonly encouraged in multi-turn conversation settings, to scalably prolong interactions. Moving beyond prompt-level behaviors, we take a mechanistic perspective and identify a query-independent, universal activation subspace associated with clarification-seeking responses. Unlike prior cost-amplification attacks that rely on per-turn prompt optimization, our attack arises from conversational dynamics and persists across prompts and tasks. We show that this mechanism provides a scalable pathway to induce turn amplification: both supply-chain attacks via fine-tuning and runtime attacks through low-level parameter corruptions consistently shift models toward abstract, clarification-seeking behavior across pro...
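The abstract's notion of a query-independent activation direction for clarification-seeking responses can be illustrated with a toy difference-of-means sketch. This is not the paper's method; it is a minimal, hedged analogue assuming one has cached residual-stream activations for clarification-seeking versus direct responses (here replaced by synthetic vectors separated along a planted direction), and that projection onto the mean-difference direction separates the two behaviors:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16  # toy hidden dimension; real models use thousands

# Hypothetical cached activations: one set gathered while the model asks
# clarifying questions, one while it answers directly. Synthetic here,
# separated along a planted direction to stand in for real data.
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)
clarify_acts = rng.normal(size=(100, d)) + 3.0 * true_dir
direct_acts = rng.normal(size=(100, d)) - 3.0 * true_dir

# Difference-of-means estimate of a "clarification" direction.
direction = clarify_acts.mean(axis=0) - direct_acts.mean(axis=0)
direction /= np.linalg.norm(direction)

def clarify_score(h):
    """Projection onto the direction; positive suggests clarification-seeking."""
    return float(h @ direction)

# Class means land on opposite sides of the direction.
print(clarify_score(clarify_acts.mean(axis=0)) > 0)
print(clarify_score(direct_acts.mean(axis=0)) < 0)
```

In a real setting such a direction would be estimated from actual model activations, and the paper's attacks (fine-tuning or parameter corruption) would amount to shifting the model's outputs along it; this sketch only shows why a single shared direction can capture a behavior across prompts.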

