[2603.03800] A Rubric-Supervised Critic from Sparse Real-World Outcomes
Computer Science > Artificial Intelligence
arXiv:2603.03800 (cs)
[Submitted on 4 Mar 2026]
Title: A Rubric-Supervised Critic from Sparse Real-World Outcomes
Authors: Xingyao Wang, Valerie Chen, Heng Ji, Graham Neubig
Abstract: Academic benchmarks for coding agents tend to reward autonomous task completion, measured by verifiable rewards such as unit-test success. In contrast, real-world coding agents operate with humans in the loop, where success signals are typically noisy, delayed, and sparse. How can we bridge this gap? In this paper, we propose a process for learning a "critic" model from sparse and noisy interaction data, which can then serve as a reward model for RL-based training or for inference-time scaling. Specifically, we introduce Critic Rubrics, a rubric-based supervision framework with 24 behavioral features that can be derived from human-agent interaction traces alone. Using a semi-supervised objective, we jointly predict these rubrics and sparse human feedback (when present). In experiments, we demonstrate that, despite being trained primarily from trace-observable rubrics and sparse real-world outcome proxies, these critics improve best-of-N reranking on SWE-bench (Best@8 +15.9 over Random@8 on the rerankable subset of trajectories), enable early stopping (+17.7 with 83% fewer attempts), and support training...
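The abstract's semi-supervised objective, which jointly predicts the 24 trace-observable rubric features and the sparse human-feedback outcome, could be sketched as a masked joint loss: a per-rubric binary cross-entropy term that is always available, plus an outcome term that is dropped whenever no feedback exists for a trace. This is a minimal illustrative sketch; the function name, the equal per-rubric weighting, and the `alpha` mixing coefficient are assumptions, not details from the paper.

```python
import math

def critic_loss(rubric_logits, rubric_labels, outcome_logit, outcome_label, alpha=0.5):
    """Hypothetical semi-supervised critic objective (sketch).

    rubric_logits/rubric_labels: predictions and 0/1 labels for the
    trace-observable rubric features (24 in the paper; any length here).
    outcome_label: sparse human feedback in {0, 1}, or None when absent.
    alpha: assumed weight on the outcome term (not specified in the paper).
    """
    def bce(logit, label):
        # numerically plain binary cross-entropy on a single logit
        p = 1.0 / (1.0 + math.exp(-logit))
        return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

    # rubric supervision is derivable from the interaction trace alone
    rubric_term = sum(bce(z, y) for z, y in zip(rubric_logits, rubric_labels)) / len(rubric_labels)

    if outcome_label is None:
        # feedback absent: train on rubric supervision only
        return rubric_term
    # feedback present: add the (sparse) outcome term
    return rubric_term + alpha * bce(outcome_logit, outcome_label)
```

The same trained critic's scalar score could then rank N candidate trajectories for best-of-N reranking, or trigger early stopping once a candidate exceeds a score threshold, matching the two inference-time uses described in the abstract.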