[2603.17839] How do LLMs Compute Verbal Confidence
Computer Science > Computation and Language

arXiv:2603.17839 (cs)

[Submitted on 18 Mar 2026 (v1), last revised 31 Mar 2026 (this version, v2)]

Title: How do LLMs Compute Verbal Confidence

Authors: Dharshan Kumaran, Arthur Conmy, Federico Barbero, Simon Osindero, Viorica Patraucean, Petar Velickovic

Abstract: Verbal confidence -- prompting LLMs to state their confidence as a number or category -- is widely used to extract uncertainty estimates from black-box models. However, how LLMs internally generate such scores remains unknown. We address two questions: first, when confidence is computed -- just-in-time when requested, or automatically during answer generation and cached for later retrieval; and second, what verbal confidence represents -- token log-probabilities, or a richer evaluation of answer quality? Focusing on Gemma 3 27B and Qwen 2.5 7B, we provide convergent evidence for cached retrieval. Activation steering, patching, noising, and swap experiments reveal that confidence representations emerge at answer-adjacent positions before appearing at the verbalization site. Attention blocking pinpoints the information flow: confidence is gathered from answer tokens, cached at the first post-answer position, then retrieved for output. Critically, linear probing and variance partitioning reveal that these cached representations explain subs...
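The linear probing the abstract mentions can be illustrated with a minimal sketch: fit a ridge-regression probe that reads a scalar confidence value out of (here, synthetic) activation vectors, then score it with R². The activations, the hidden "confidence direction", and all shapes below are toy assumptions for illustration, not the paper's actual setup or models.

```python
import numpy as np

def fit_linear_probe(acts, conf, l2=1e-2):
    """Closed-form ridge regression: predict a confidence score from activations."""
    X = np.hstack([acts, np.ones((acts.shape[0], 1))])  # append a bias column
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + l2 * np.eye(d), X.T @ conf)

def probe_r2(acts, conf, w):
    """Fraction of confidence variance explained by the linear probe."""
    X = np.hstack([acts, np.ones((acts.shape[0], 1))])
    residual = conf - X @ w
    return 1.0 - residual.var() / conf.var()

# Toy demo: confidence is a noisy linear readout of one hidden direction,
# standing in for a cached representation at the first post-answer position.
rng = np.random.default_rng(0)
acts = rng.normal(size=(500, 64))       # hypothetical residual-stream activations
direction = rng.normal(size=64)
conf = acts @ direction + 0.1 * rng.normal(size=500)

w = fit_linear_probe(acts, conf)
print(f"probe R^2 = {probe_r2(acts, conf, w):.3f}")
```

On this synthetic data the probe recovers nearly all the variance; on real activations, the paper's variance-partitioning step would then ask how much of that explained variance goes beyond token log-probabilities.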