[2603.01025] One-Token Verification for Reasoning Correctness Estimation
Computer Science > Machine Learning

arXiv:2603.01025 (cs)

[Submitted on 1 Mar 2026]

Title: One-Token Verification for Reasoning Correctness Estimation

Authors: Zhan Zhuang, Xiequn Wang, Zebin Chen, Feiyang Ye, Ying Wei, Kede Ma, Yu Zhang

Abstract: Recent breakthroughs in large language models (LLMs) have led to notable successes in complex reasoning tasks, such as mathematical problem solving. A common strategy for improving performance is parallel thinking, in which multiple reasoning traces are generated and the final prediction is made using aggregation schemes like majority voting or best-of-$N$ decoding. However, two key challenges persist. First, multi-sample decoding incurs substantial inference latency, especially for long-form outputs. Second, effective mechanisms for reliably assessing the correctness of individual reasoning traces are still limited. To address these challenges, we introduce One-Token Verification (OTV), a computational method that estimates reasoning correctness in a single forward pass during generation. OTV is activated by a learnable token and integrated into the LLM via low-rank adaptation to probe internal reasoning signals through the key-value cache, supporting token-level correctness estimation at any stage of generation without disrupting primary reasoning. Experiments on mathematical r...
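The majority-voting aggregation the abstract mentions can be sketched as follows. This is a minimal illustration, not code from the paper; the trace answers are hypothetical, and in practice each answer would be extracted from a full sampled reasoning trace.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent final answer among N sampled reasoning traces."""
    counts = Counter(answers)
    winner, _ = counts.most_common(1)[0]
    return winner

# Hypothetical final answers extracted from N = 5 sampled traces.
traces = ["42", "41", "42", "42", "7"]
print(majority_vote(traces))  # → 42
```

The latency cost the abstract highlights follows directly from this scheme: all $N$ traces must be decoded to completion before the vote can be taken.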
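The core idea of OTV, as the abstract describes it, is that a learnable verifier token attends over the model's key-value cache and a lightweight head reads off a correctness estimate in one forward pass. A toy sketch of that attend-then-probe pattern, with all names, dimensions, and weights hypothetical (the paper's actual LoRA-based integration is not reproduced here):

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def verify_one_token(kv_cache, verifier_query, probe_w, probe_b):
    """Toy single-pass verification: the verifier token's query attends over
    the cached keys/values of the reasoning trace so far, and a linear probe
    on the attended state yields a correctness probability via a sigmoid."""
    keys, values = kv_cache
    attn = softmax([dot(verifier_query, k) for k in keys])
    dim = len(values[0])
    state = [sum(a * v[i] for a, v in zip(attn, values)) for i in range(dim)]
    logit = dot(probe_w, state) + probe_b
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical 2-d cache for a 3-token trace; weights are made up.
cache = ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],   # keys
         [[0.5, 0.2], [0.1, 0.9], [0.4, 0.4]])   # values
score = verify_one_token(cache, [1.0, 0.5], [2.0, -1.0], 0.0)
print(0.0 < score < 1.0)  # → True
```

Because the probe only reads the existing cache, a score of this kind can be queried at any point in generation without altering the primary decoding path, which matches the "without disrupting primary reasoning" property the abstract claims.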