[2509.26522] Entropy After $\langle \texttt{/Think} \rangle$ for reasoning model early exiting

arXiv - Machine Learning · 4 min read · Article

Summary

The paper presents Entropy After </Think> (EAT), a simple and inexpensive signal for exiting a reasoning LLM's chain of thought early, cutting unnecessary computation while maintaining accuracy.

Why It Matters

As reasoning models become integral to AI applications, optimizing their efficiency is crucial. EAT addresses overthinking in LLMs, the tendency to keep reasoning after the answer is already settled, which wastes computational resources. This research contributes to ongoing work on improving AI performance while allocating compute effectively.

Key Takeaways

  • EAT helps detect and prevent overthinking in reasoning models.
  • The method can reduce token usage by 12-22% without compromising accuracy.
  • EAT is effective even in black-box settings where model internals are inaccessible (one plausible way to compute it there is sketched after this list).
  • The approach allows for adaptive compute allocation based on reasoning dynamics.
  • Empirical results on MATH500 and AIME2025 validate the proposed method.
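The black-box takeaway deserves a concrete illustration. One plausible way to compute an EAT-style entropy without access to model internals, an assumption on our part rather than the paper's stated recipe, is to approximate the next-token entropy from the top-k logprobs that many hosted LLM APIs return, lumping the unreturned tail mass into a single pseudo-token:

```python
import math

def entropy_from_top_logprobs(top_logprobs: dict[str, float]) -> float:
    """Rough next-token entropy (nats) from API top-k logprobs.

    Assumption: the probability mass outside the returned top-k is lumped
    into one pseudo-token, so this is only an approximation of the true
    entropy over the full vocabulary.
    """
    probs = [math.exp(lp) for lp in top_logprobs.values()]
    rest = max(1e-12, 1.0 - sum(probs))  # leftover mass, clamped for log()
    probs.append(rest)
    return -sum(p * math.log(p) for p in probs)
```

Since the tail is collapsed into one outcome, this underestimates the tail's entropy contribution; for a stopping rule that only needs the signal's trend to stabilize, a consistent approximation of this kind may suffice.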

arXiv:2509.26522 (cs.LG) · Submitted on 30 Sep 2025 (v1), last revised 19 Feb 2026 (this version, v2)

Title: Entropy After $\langle \texttt{/Think} \rangle$ for reasoning model early exiting
Authors: Xi Wang, James McInerney, Lequn Wang, Nathan Kallus

Abstract: Reasoning LLMs show improved performance with longer chains of thought. However, recent work has highlighted their tendency to overthink, continuing to revise answers even after reaching the correct solution. We quantitatively confirm this inefficiency from the distribution dynamics perspective by tracking Pass@1 for answers averaged over a large number of rollouts, and find that the model often begins to consistently produce the correct answer early in the reasoning, making the extra reasoning tokens wasteful. To detect and prevent overthinking, we propose a simple and inexpensive novel signal, Entropy After </Think> (EAT), for monitoring and deciding whether to exit reasoning early. By appending a stop-thinking token (</think>) and monitoring the entropy of the following token as the model reasons, we obtain a trajectory that decreases and stabilizes when Pass@1 plateaus; thresholding its variance under an exponential moving average yields a practical stopping rule. Importantly, our approach enables adaptively allocating compute based on the EAT trajectory...
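To make the mechanism concrete, here is a minimal sketch of the two pieces the abstract describes: probing the entropy of the token that would follow an appended </think>, and stopping once an exponential-moving-average estimate of that signal's variance stabilizes below a threshold. It assumes a HuggingFace-style causal LM with explicit <think>/</think> tags; the model name, the alpha and threshold values, and helper names such as eat_signal and EMAVarianceStopper are illustrative assumptions, not the paper's reference implementation.

```python
# Illustrative sketch of the EAT early-exit signal (assumptions noted above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed reasoning model
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, torch_dtype=torch.bfloat16)
model.eval()

STOP_IDS = tokenizer.encode("</think>", add_special_tokens=False)

@torch.no_grad()
def eat_signal(prompt_ids: torch.Tensor, reasoning_ids: torch.Tensor) -> float:
    """Entropy (nats) of the next-token distribution after appending </think>.

    Both inputs are token-id tensors of shape (1, seq_len).
    """
    probe = torch.cat(
        [prompt_ids, reasoning_ids, torch.tensor([STOP_IDS])], dim=-1
    )
    logits = model(probe).logits[0, -1]       # logits for the token after </think>
    logp = torch.log_softmax(logits.float(), dim=-1)
    return float(-(logp.exp() * logp).sum())  # Shannon entropy

class EMAVarianceStopper:
    """Exit once the EMA-smoothed variance of the EAT trajectory is small,
    i.e. the signal has decreased and stabilized."""

    def __init__(self, alpha=0.1, var_threshold=1e-3, min_probes=5):
        self.alpha = alpha
        self.var_threshold = var_threshold    # illustrative value; needs tuning
        self.min_probes = min_probes
        self.mean, self.var, self.n = None, 0.0, 0

    def update(self, entropy: float) -> bool:
        self.n += 1
        if self.mean is None:
            self.mean = entropy
            return False
        diff = entropy - self.mean
        incr = self.alpha * diff
        self.mean += incr                                       # EMA of signal
        self.var = (1 - self.alpha) * (self.var + diff * incr)  # EMA variance
        return self.n >= self.min_probes and self.var < self.var_threshold
```

In a generation loop, one would call eat_signal every K reasoning tokens (each probe costs an extra forward pass, though caching the shared prefix keeps it cheap) and, when update returns True, force-append </think> and decode the final answer, which is how the adaptive compute allocation in the takeaways would plausibly be realized.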
