[2602.18232] Thinking by Subtraction: Confidence-Driven Contrastive Decoding for LLM Reasoning
Summary
The paper presents Confidence-Driven Contrastive Decoding (CCD), a method that improves the reasoning accuracy of large language models (LLMs) by selectively intervening at low-confidence tokens during decoding.
Why It Matters
This research is significant as it challenges the conventional approach of uniformly increasing computation for LLM reasoning, demonstrating that targeted interventions can enhance accuracy and efficiency. By focusing on low-confidence areas, it offers a more nuanced method for improving LLM outputs, which is crucial for applications requiring high reliability.
Key Takeaways
- CCD improves reasoning reliability by targeting low-confidence tokens.
- The method reduces unnecessary output length while enhancing accuracy.
- It operates without additional training, making it efficient for practical use.
Computer Science > Computation and Language
arXiv:2602.18232 (cs) [Submitted on 20 Feb 2026]
Title: Thinking by Subtraction: Confidence-Driven Contrastive Decoding for LLM Reasoning
Authors: Lexiang Tang, Weihao Gao, Bingchen Zhao, Lu Ma, Qiao Jin, Bang Yang, Yuexian Zou
Abstract: Recent work on test-time scaling for large language model (LLM) reasoning typically assumes that allocating more inference-time computation uniformly improves correctness. However, prior studies show that reasoning uncertainty is highly localized: a small subset of low-confidence tokens disproportionately contributes to reasoning errors and unnecessary output expansion. Motivated by this observation, we propose Thinking by Subtraction, a confidence-driven contrastive decoding approach that improves reasoning reliability through targeted token-level intervention. Our method, Confidence-Driven Contrastive Decoding (CCD), detects low-confidence tokens during decoding and intervenes selectively at these positions. It constructs a contrastive reference by replacing high-confidence tokens with minimal placeholders, and refines predictions by subtracting this reference distribution at low-confidence locations. Experiments show that CCD significantly improves accuracy across mathematical reasoning benchmarks while substantially reducing ...
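To make the decoding rule in the abstract concrete, here is a minimal sketch of one CCD-style decoding step: decode greedily while the model is confident, and only at low-confidence positions subtract a contrastive reference distribution in log-space before picking the token. The confidence threshold `tau`, the contrast weight `alpha`, and the way `ref_logits` is obtained are illustrative assumptions, not values or procedures from the paper (the paper builds the reference by replacing high-confidence tokens with placeholders; here it is simply passed in).

```python
import numpy as np

def softmax(x):
    # numerically stable softmax over a 1-D logit vector
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def ccd_step(logits, ref_logits, tau=0.5, alpha=1.0):
    """One illustrative CCD-style decoding step.

    logits:     main model's next-token logits
    ref_logits: logits from the contrastive reference (assumed given here)
    tau:        hypothetical confidence threshold for intervening
    alpha:      hypothetical weight on the subtracted reference
    Returns (chosen_token_id, confidence).
    """
    probs = softmax(logits)
    confidence = float(probs.max())
    if confidence >= tau:
        # high-confidence position: leave the prediction untouched
        return int(probs.argmax()), confidence
    # low-confidence position: subtract the reference log-distribution
    contrast = np.log(probs + 1e-12) - alpha * np.log(softmax(ref_logits) + 1e-12)
    return int(contrast.argmax()), confidence
```

With a confident distribution (e.g. logits `[2.0, 1.0, 0.5]`) the step reduces to ordinary greedy decoding; with a near-uniform, low-confidence distribution the subtraction penalizes tokens the reference also favors, steering the choice away from the reference's preference.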