[2510.02249] Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation
Computer Science > Computation and Language

arXiv:2510.02249 (cs) [Submitted on 2 Oct 2025 (v1), last revised 21 Mar 2026 (this version, v2)]

Title: Explore Briefly, Then Decide: Mitigating LLM Overthinking via Cumulative Entropy Regulation

Authors: Yi Bin, Tianyi Jiang, Yujuan Ding, Kainian Zhu, Fei Ma, Jingkuan Song, Yang Yang, Heng Tao Shen

Abstract: Large Language Models (LLMs) have demonstrated remarkable reasoning abilities on complex problems using long Chain-of-Thought (CoT) reasoning. However, they often suffer from overthinking, i.e., generating unnecessarily lengthy reasoning steps for simpler problems. This degrades efficiency and makes it difficult for the models to adapt their reasoning depth to the complexity of the problem. To address this, we introduce a novel metric, Token Entropy Cumulative Average (TECA), which measures the extent of exploration throughout the reasoning process. We further propose a novel reasoning paradigm, "Explore Briefly, Then Decide", with an associated Cumulative Entropy Regulation (CER) mechanism. This paradigm leverages TECA to help the model dynamically determine the optimal point at which to conclude its thought process and provide a final answer, thus achieving efficient reasoning. Experimental results across diverse mathematical benchmarks sho...
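The abstract does not spell out the TECA formula, but the name suggests a running mean of per-token entropy over the generated reasoning trace. A minimal sketch under that assumption (the function names `token_entropy` and `teca` are illustrative, not from the paper):

```python
import numpy as np

def token_entropy(logits):
    """Shannon entropy of the next-token distribution at each step.

    logits: array of shape (T, V) -- T generated tokens, vocab size V.
    Returns an array of shape (T,) with one entropy value per token.
    """
    z = logits - logits.max(axis=-1, keepdims=True)   # numerical stability
    p = np.exp(z) / np.exp(z).sum(axis=-1, keepdims=True)
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

def teca(logits):
    """Token Entropy Cumulative Average, assumed here to be the
    cumulative mean of token entropies up to each step t."""
    h = token_entropy(logits)
    return np.cumsum(h) / np.arange(1, len(h) + 1)
```

Under this reading, a TECA curve that flattens at a low value would signal that exploration has subsided, which is presumably the kind of signal CER uses to decide when to stop reasoning and emit the answer.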