[2603.03538] Online Learnability of Chain-of-Thought Verifiers:

[2603.03538] Online Learnability of Chain-of-Thought Verifiers: Soundness and Completeness Trade-offs

arXiv - Machine Learning March 05, 2026 4 min read

About this article

Abstract page for arXiv paper 2603.03538: Online Learnability of Chain-of-Thought Verifiers: Soundness and Completeness Trade-offs

Computer Science > Machine Learning arXiv:2603.03538 (cs) [Submitted on 3 Mar 2026] Title:Online Learnability of Chain-of-Thought Verifiers: Soundness and Completeness Trade-offs Authors:Maria-Florina Balcan, Avrim Blum, Kiriaki Fragkia, Zhiyuan Li, Dravyansh Sharma View a PDF of the paper titled Online Learnability of Chain-of-Thought Verifiers: Soundness and Completeness Trade-offs, by Maria-Florina Balcan and 4 other authors View PDF HTML (experimental) Abstract:Large language models with chain-of-thought generation have demonstrated great potential for producing complex mathematical proofs. However, their reasoning can often go astray, leading to increasing interest in formal and learned verifiers. A major challenge in learning verifiers, especially when their output will be used by the prover, is that this feedback loop may produce substantial distribution shift. Motivated by this challenge, we propose an online learning framework for learning chain-of-thought verifiers that, given a problem and a sequence of reasoning steps, check the correctness of the solution. Highlighting the asymmetric role of soundness (failure in catching errors in a proof) and completeness (flagging correct proofs as wrong) mistakes of the verifier, we introduce novel extensions of the Littlestone dimension which tightly characterize the mistake bounds for learning a verifier in the realizable setting. We provide optimal algorithms for finding the Pareto-frontier (the smallest total number of...

Originally published on March 05, 2026. Curated by AI News.

Llms

This Is Not Hacking. This Is Structured Intelligence.

Watch me demonstrate everything I've been talking about—live, in real time. The Setup: Maestro University AI enrollment system Standard c...

Reddit - Artificial Intelligence · 1 min · 43 minutes ago

Llms

[D] Howcome Muon is only being used for Transformers?

Muon has quickly been adopted in LLM training, yet we don't see it being talked about in other contexts. Searches for Muon on ConvNets tu...

Reddit - Machine Learning · 1 min · about 1 hour ago

Llms

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Hi Everybody! I just wanted to share an update on a project I’ve been working on called BULaMU, a family of language models trained (20M,...

Reddit - Machine Learning · 1 min · about 2 hours ago

Llms

Popular AI gateway startup LiteLLM ditches controversial startup Delve | TechCrunch

LiteLLM had obtained two security compliance certifications via Delve and fell victim to some horrific credential-stealing malware last w...

TechCrunch - AI · 3 min · about 5 hours ago

[2603.03538] Online Learnability of Chain-of-Thought Verifiers: Soundness and Completeness Trade-offs

About this article

Related Articles

This Is Not Hacking. This Is Structured Intelligence.

[D] Howcome Muon is only being used for Transformers?

[P] I trained a language model from scratch for a low resource language and got it running fully on-device on Android (no GPU, demo)

Popular AI gateway startup LiteLLM ditches controversial startup Delve | TechCrunch

No comments

Stay updated with AI News