[2602.12846] Amortized Reasoning Tree Search: Decoupling Proposal and Decision in Large Language Models

arXiv - Machine Learning 4 min read Article

Summary

The paper presents Amortized Reasoning Tree Search (ARTS), an approach that improves reasoning in Large Language Models by decoupling the proposal of reasoning steps from the decision of which steps to pursue, addressing a failure mode of Reinforcement Learning with Verifiable Rewards (RLVR).

Why It Matters

This research is significant because it addresses a pathology of RL-based alignment: the suppression of valid but low-likelihood reasoning paths, which degrades performance on complex reasoning tasks. By introducing ARTS, the authors provide a method that preserves the base model's latent diversity while strengthening its reasoning, which matters for advancing AI applications across many fields.

Key Takeaways

  • ARTS decouples proposal and decision-making to improve reasoning in LLMs.
  • The approach addresses the "Normalization Squeeze" issue in reinforcement learning.
  • ARTS achieves competitive performance on benchmarks without altering the generative model.
  • The method shows significant recovery in performance on long-tail reasoning tasks.
  • Flow Matching objective enhances navigation through complex search spaces.
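The "Normalization Squeeze" named in the takeaways can be made concrete with a toy simulation. This is my own sketch, not the paper's formal analysis: three candidate reasoning paths, two correct (one dominant, one rare) and one wrong, trained with a group-normalized REINFORCE-style update. Even though the rare path is rewarded whenever it is sampled, the dominant correct path accumulates logit mass far faster, so the rare path's probability collapses toward zero.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    z = [math.exp(l - m) for l in logits]
    s = sum(z)
    return [v / s for v in z]

def train(steps=200, k=8, lr=0.3, seed=0):
    """Toy RLVR loop. Actions: 0 = dominant correct path,
    1 = rare correct path, 2 = wrong path. Rewards are verifiable
    (1 for correct, 0 for wrong); advantages are group-normalized
    against the mean reward of the k sampled rollouts."""
    random.seed(seed)
    reward = [1.0, 1.0, 0.0]
    logits = [2.0, -2.0, 0.0]  # rare correct path starts low-likelihood
    history = []
    for _ in range(steps):
        probs = softmax(logits)
        samples = random.choices(range(3), weights=probs, k=k)
        baseline = sum(reward[a] for a in samples) / k
        for a in samples:
            adv = reward[a] - baseline
            # REINFORCE: grad of log pi(a) w.r.t. logits = one_hot(a) - probs
            for i in range(3):
                logits[i] += lr * adv * ((1.0 if i == a else 0.0) - probs[i])
        history.append(softmax(logits)[1])  # track P(rare correct path)
    return history

h = train()
print(f"P(rare correct path): start={h[0]:.4f}, end={h[-1]:.4f}")
```

The rare path is starved of gradient signal by finite sampling (with k=8 rollouts it is rarely drawn at all), while renormalization shifts its probability mass to the dominant correct path, the high-pass filtering behavior the paper describes.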

Computer Science > Machine Learning — arXiv:2602.12846 (cs) [Submitted on 13 Feb 2026]

Title: Amortized Reasoning Tree Search: Decoupling Proposal and Decision in Large Language Models
Authors: Zesheng Hong, Jiadong Yu, Hui Pan

Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has established itself as the dominant paradigm for instilling rigorous reasoning capabilities in Large Language Models. While effective at amplifying dominant behaviors, we identify a critical pathology in this alignment process: the systematic suppression of valid but rare (low-likelihood under the base model distribution) reasoning paths. We theoretically characterize this phenomenon as a "Normalization Squeeze," where the interplay between mode-seeking policy gradients and finite sampling acts as a high-pass likelihood filter, driving the probability of rare correct traces to statistical extinction. To counteract this collapse without discarding the base model's latent diversity, we propose Amortized Reasoning Tree Search (ARTS). Unlike standard approaches that force internalization via parameter updates, ARTS prioritizes deliberation by decoupling generation from verification. We introduce a Flow Matching objective that repurposes the verifier to estimate the conservation of probability flow, enabling ...
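The abstract's core idea, decoupling a frozen proposer from a learned decision signal, can be sketched as a best-first tree search. This is a hypothetical reconstruction, not the paper's implementation: `propose` stands in for the frozen base LLM sampling candidate next reasoning steps, and `flow_score` for the verifier repurposed as a flow/value estimate; the actual Flow Matching objective is not reproduced here.

```python
import heapq

def arts_search(root, propose, flow_score, beam=4, max_depth=3):
    """Best-first search sketch: the frozen proposer generates candidate
    continuations; the verifier's flow estimate decides which branches to
    expand. The generative model itself is never updated."""
    counter = 0  # tie-breaker so heap never compares states directly
    frontier = [(-flow_score(root), counter, 0, root)]  # max-heap via negation
    best = (flow_score(root), root)
    while frontier:
        neg_score, _, depth, state = heapq.heappop(frontier)
        if -neg_score > best[0]:
            best = (-neg_score, state)
        if depth >= max_depth:
            continue
        for child in propose(state, n=beam):
            counter += 1
            heapq.heappush(frontier, (-flow_score(child), counter, depth + 1, child))
    return best

# Toy instantiation on strings: each "step" appends a token, and a mock
# verifier rewards states containing the token "b".
def propose(state, n):
    return [state + t for t in "ab"[:n]]

def flow_score(state):
    return state.count("b")

score, state = arts_search("", propose, flow_score, beam=2, max_depth=3)
print(score, state)  # -> 3 bbb
```

The point of the decoupling is visible even in this toy: the proposer's distribution is left intact (rare branches remain reachable), and the decision of which trace to commit to is amortized into the verifier-guided search rather than burned into the model's weights.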


