[2603.21301] Enhancing Reasoning Accuracy in Large Language Models During Inference Time
Computer Science > Computation and Language
arXiv:2603.21301 (cs)
[Submitted on 22 Mar 2026]

Title: Enhancing Reasoning Accuracy in Large Language Models During Inference Time
Authors: Vinay Sharma, Manish Jain

Abstract: Large Language Models (LLMs) often exhibit strong linguistic abilities while remaining unreliable on multi-step reasoning tasks, particularly when deployed without additional training or fine-tuning. In this work, we study inference-time techniques to improve the reasoning accuracy of LLMs. We systematically evaluate three classes of inference-time strategies: (i) self-consistency via stochastic decoding, where the model is sampled multiple times under controlled temperature and nucleus sampling and the most frequent final answer is selected; (ii) dual-model reasoning agreement, where outputs from two independent models are compared and only consistent reasoning traces are trusted; and (iii) self-reflection, where the model critiques and revises its own reasoning. Across all evaluated methods, we employ Chain-of-Thought (CoT) [1] prompting to elicit explicit intermediate reasoning steps before generating final answers. We provide a controlled comparative evaluation of these three strategies under identical prompting and verification settings. Our experiments on L...
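The self-consistency strategy in (i) can be sketched as a majority vote over sampled answers. This is a minimal illustration, not the paper's implementation: `sample_model` is a hypothetical stand-in for a temperature/nucleus-sampled LLM call producing a CoT trace, and `extract_final_answer` assumes the trace ends with an "Answer:" line.

```python
import random
from collections import Counter

def sample_model(prompt: str, temperature: float = 0.7, top_p: float = 0.9) -> str:
    """Hypothetical stand-in for a stochastic LLM call.

    A real implementation would query a model with the given temperature
    and nucleus (top-p) settings; here we simulate a model that usually,
    but not always, reaches the correct answer.
    """
    answer = random.choice(["42", "42", "42", "41"])  # simulated noisy outputs
    return f"Step 1: reason about the problem.\nStep 2: combine.\nAnswer: {answer}"

def extract_final_answer(trace: str) -> str:
    """Parse the final answer from a CoT trace ending in 'Answer: <x>'."""
    return trace.rsplit("Answer:", 1)[-1].strip()

def self_consistency(prompt: str, n_samples: int = 10) -> str:
    """Sample the model n_samples times and return the most frequent answer."""
    answers = [extract_final_answer(sample_model(prompt)) for _ in range(n_samples)]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner
```

Because voting aggregates over independent stochastic decodes, occasional reasoning errors are outvoted by the majority answer, which is the intuition behind the self-consistency family of methods.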