[2505.15008] Know When to Abstain: Optimal Selective Classification with Likelihood Ratios

arXiv - Machine Learning · 4 min read

Summary

The paper derives optimal selective classification rules from likelihood ratio tests, improving the reliability of predictive models by letting them abstain on uncertain inputs, particularly under covariate shift.

Why It Matters

This research is significant as it addresses the challenge of selective classification in machine learning, particularly when the test data distribution differs from training data. By applying the Neyman-Pearson lemma, it offers a unified approach to improve model performance in real-world scenarios, making it relevant for practitioners and researchers in AI and machine learning.

Key Takeaways

  • Selective classification can improve model reliability by allowing abstention from uncertain predictions.
  • The Neyman-Pearson lemma provides a theoretical foundation for optimal selection functions.
  • Proposed methods outperform existing baselines in various vision and language tasks.
  • The study highlights the importance of addressing covariate shift in machine learning applications.
  • Publicly available code supports further research and application of the proposed methods.

Computer Science > Machine Learning
arXiv:2505.15008 (cs)
[Submitted on 21 May 2025 (v1), last revised 14 Feb 2026 (this version, v2)]

Title: Know When to Abstain: Optimal Selective Classification with Likelihood Ratios
Authors: Alvin Heng, Harold Soh

Abstract: Selective classification enhances the reliability of predictive models by allowing them to abstain from making uncertain predictions. In this work, we revisit the design of optimal selection functions through the lens of the Neyman--Pearson lemma, a classical result in statistics that characterizes the optimal rejection rule as a likelihood ratio test. We show that this perspective not only unifies the behavior of several post-hoc selection baselines, but also motivates new approaches to selective classification which we propose here. A central focus of our work is the setting of covariate shift, where the input distribution at test time differs from that at training. This realistic and challenging scenario remains relatively underexplored in the context of selective classification. We evaluate our proposed methods across a range of vision and language tasks, including both supervised learning and vision-language models. Our experiments demonstrate that our Neyman--Pearson-informed methods consistently outperform existing baselines, indicating that ...
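The rejection rule the abstract describes, a likelihood ratio test in the spirit of the Neyman--Pearson lemma, can be illustrated with a toy sketch. This is not the paper's code: the Gaussian generative model, the threshold `tau`, and the helper `gauss_pdf` are illustrative assumptions.

```python
import numpy as np

def gauss_pdf(x, mu, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)

# Toy binary task: class 0 ~ N(-1, 1), class 1 ~ N(+1, 1).
x = np.concatenate([rng.normal(-1.0, 1.0, 500), rng.normal(+1.0, 1.0, 500)])
y = np.concatenate([np.zeros(500), np.ones(500)])

# Class-conditional likelihoods under the (known) generative model.
p0 = gauss_pdf(x, -1.0)
p1 = gauss_pdf(x, +1.0)

# Bayes classifier: predict the class with the larger likelihood.
pred = (p1 > p0).astype(float)

# Likelihood-ratio selection: abstain when the ratio is close to 1,
# i.e. when the evidence for either class is weak.
ratio = np.maximum(p0, p1) / np.minimum(p0, p1)
tau = 2.0  # illustrative threshold; raising tau abstains more often
accept = ratio >= tau

coverage = accept.mean()
full_acc = (pred == y).mean()
selective_acc = (pred[accept] == y[accept]).mean()
print(f"coverage={coverage:.2f}  accuracy: full={full_acc:.2f}  selective={selective_acc:.2f}")
```

Raising `tau` trades coverage for accuracy on the retained points, which is the basic tension selective classification manages; the paper's contribution is choosing the densities in the ratio optimally, including under covariate shift.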

Related Articles

Machine Learning

[D] Your Agent, Their Asset: Real-world safety evaluation of OpenClaw agents (CIK poisoning raises attack success to ~64–74%)

Paper: https://arxiv.org/abs/2604.04759 This paper presents a real-world safety evaluation of OpenClaw, a personal AI agent with access t...

Reddit - Machine Learning · 1 min
Machine Learning

Anthropic debuts preview of powerful new AI model Mythos in new cybersecurity initiative

The new model will be used by a small number of high-profile companies to engage in defensive cybersecurity work.

TechCrunch - AI · 5 min
Machine Learning

Anthropic says its latest AI model is too powerful for public release and that it broke containment during testing

Anthropic said Claude Mythos is too good at hacking and that's why you won't be able to use it anytime soon.

AI Tools & Products · 5 min
Llms

Anthropic's latest AI model identifies 'thousands of zero-day vulnerabilities' in 'every major operating system and every major web browser' — Claude Mythos Preview sparks race to fix critical bugs, some unpatched for decades

Anthropic holds back its most advanced model yet to allow companies and institutions to prepare.

AI Tools & Products · 6 min
