[2603.22879] Confidence Calibration under Ambiguous Ground Truth



Computer Science > Machine Learning
arXiv:2603.22879 (cs) [Submitted on 24 Mar 2026]

Title: Confidence Calibration under Ambiguous Ground Truth
Authors: Linwei Tao, Haoyang Luo, Minjing Dong, Chang Xu

Abstract: Confidence calibration assumes a unique ground-truth label per input, yet this assumption fails wherever annotators genuinely disagree. Post-hoc calibrators fitted on majority-voted labels, the standard single-label targets used in practice, can appear well-calibrated under conventional evaluation yet remain substantially miscalibrated against the underlying annotator distribution. We show that this failure is structural: under simplifying assumptions, Temperature Scaling is biased toward temperatures that underestimate annotator uncertainty, with true-label miscalibration increasing monotonically with annotation entropy. To address this, we develop a family of ambiguity-aware post-hoc calibrators that optimise proper scoring rules against the full label distribution and require no model retraining. Our methods span progressively weaker annotation requirements: Dirichlet-Soft leverages the full annotator distribution and achieves the best overall calibration quality across settings; Monte Carlo Temperature Scaling with a single annotation per example (MCTS S=1) matches full-distribution calibration across all benchmarks, dem...
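To make the abstract's core contrast concrete, here is a minimal sketch of fitting a temperature parameter by minimizing cross-entropy (a proper scoring rule) against a full annotator label distribution versus against majority-voted one-hot labels. This is an illustrative toy, not the paper's implementation: the function names (`soft_ce`, `fit_temperature`) and the synthetic data are hypothetical, and the paper's actual calibrators (Dirichlet-Soft, MCTS) are more involved than plain soft-label temperature scaling.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def soft_ce(logits, label_dist, T):
    """Cross-entropy of temperature-scaled softmax probabilities
    against a (possibly soft) target label distribution."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    log_p = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -(label_dist * log_p).sum(axis=1).mean()

def fit_temperature(logits, label_dist):
    """Find T > 0 minimizing cross-entropy on held-out data."""
    res = minimize_scalar(lambda T: soft_ce(logits, label_dist, T),
                          bounds=(0.05, 20.0), method="bounded")
    return res.x

# Toy setup: annotator distributions over 3 classes, and logits that
# are a sharpened version of those distributions (overconfident model).
rng = np.random.default_rng(0)
dist = rng.dirichlet(np.ones(3) * 2.0, size=500)   # annotator votes
logits = 3.0 * np.log(dist + 1e-8)                  # overconfident logits

# Calibrate against the full annotator distribution ...
T_soft = fit_temperature(logits, dist)
# ... versus against majority-voted one-hot targets, which discard
# annotator disagreement and so favor a lower (sharper) temperature.
hard = np.eye(3)[dist.argmax(axis=1)]
T_hard = fit_temperature(logits, hard)
```

Under this toy setup, `T_hard < T_soft`: fitting on majority-vote labels rewards sharper predictions and so underestimates annotator uncertainty, which is the structural bias the abstract describes.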

Originally published on March 25, 2026. Curated by AI News.

