[2512.01925] Rectifying LLM Thought from Lens of Optimization
Computer Science > Computation and Language

arXiv:2512.01925 (cs)

[Submitted on 1 Dec 2025 (v1), last revised 7 Apr 2026 (this version, v2)]

Title: Rectifying LLM Thought from Lens of Optimization
Authors: Junnan Liu, Hongwei Liu, Songyang Zhang, Kai Chen

Abstract: Recent advancements in large language models (LLMs) have been driven by their emergent reasoning capabilities, particularly through long chain-of-thought (CoT) prompting, which enables thorough exploration and deliberation. Despite these advances, long-CoT LLMs often exhibit suboptimal reasoning behaviors, such as overthinking and excessively protracted reasoning chains, which can impair performance. In this paper, we analyze reasoning processes through an optimization lens, framing CoT as a gradient descent procedure where each reasoning step constitutes an update toward problem resolution. Building on this perspective, we introduce RePro (Rectifying Process-level Reward), a novel approach to refine LLM reasoning during post-training. RePro defines a surrogate objective function to assess the optimization process underlying CoT, utilizing a dual scoring mechanism to quantify its intensity and stability. These scores are aggregated into a composite process-level reward, seamlessly integrated into reinforcement learning with verifiable rewards (RLVR) pipelines to optimize LL...
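The abstract does not give the paper's actual formulas, but the dual-scoring idea can be illustrated with a minimal sketch. The Python below assumes the surrogate objective is tracked as a per-step scalar (lower means closer to a solution), takes "intensity" as the mean per-step decrease and "stability" as a monotonicity measure, and mixes the result into a verifiable outcome reward; every function name, weighting, and combination rule here is an assumption for illustration, not RePro's definition.

```python
from statistics import mean

def process_level_reward(step_losses, w_intensity=0.5, w_stability=0.5):
    """Hypothetical RePro-style composite process reward.

    step_losses: surrogate objective value after each reasoning step,
    framed as iterates of a gradient-descent-like procedure. The
    intensity/stability scores below are illustrative proxies only.
    """
    # Per-step "descent": positive when a step reduces the objective.
    deltas = [prev - cur for prev, cur in zip(step_losses, step_losses[1:])]
    if not deltas:
        return 0.0
    # Intensity: how strongly the chain descends on average.
    intensity = mean(deltas)
    # Stability: fraction of steps that actually make progress;
    # a monotone chain scores 1.0, an oscillating one scores lower.
    stability = sum(d > 0 for d in deltas) / len(deltas)
    return w_intensity * intensity + w_stability * stability

def rlvr_reward(outcome_correct, step_losses, beta=0.1):
    # Composite reward for an RLVR pipeline: verifiable outcome
    # signal plus the process-level term, weighted by beta.
    return float(outcome_correct) + beta * process_level_reward(step_losses)

if __name__ == "__main__":
    steady = [1.0, 0.7, 0.4, 0.2]        # concise, monotone chain
    meander = [1.0, 1.2, 0.5, 0.9, 0.2]  # overthinking, oscillating chain
    print(rlvr_reward(True, steady))     # higher process reward
    print(rlvr_reward(True, meander))    # lower process reward
```

Under this toy scoring, two chains that both reach a correct answer are separated by how directly they got there, which matches the abstract's stated goal of penalizing overthinking and protracted reasoning without sacrificing outcome correctness.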