[2505.15008] Know When to Abstain: Optimal Selective Classification with Likelihood Ratios
Summary
The paper develops optimal selective classification using likelihood ratios, improving the reliability of predictive models by allowing them to abstain from uncertain predictions, especially under covariate shift.
Why It Matters
This research addresses the challenge of selective classification in machine learning, particularly when the test-time data distribution differs from the training distribution. By applying the Neyman-Pearson lemma, it offers a unified approach to improving model performance in real-world scenarios, making it relevant for practitioners and researchers in AI and machine learning.
Key Takeaways
- Selective classification can improve model reliability by allowing abstention from uncertain predictions.
- The Neyman-Pearson lemma provides a theoretical foundation for optimal selection functions.
- Proposed methods outperform existing baselines in various vision and language tasks.
- The study highlights the importance of addressing covariate shift in machine learning applications.
- Publicly available code supports further research and application of the proposed methods.
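The Neyman-Pearson view characterizes the optimal rejection rule as a likelihood ratio test: accept an input when a ratio of densities exceeds a threshold, and abstain otherwise. The toy sketch below illustrates this idea under covariate shift, using two hypothetical Gaussian input densities in place of learned density estimates; the function name, densities, and threshold are illustrative assumptions, not the paper's exact method.

```python
import numpy as np
from scipy.stats import multivariate_normal

# Toy densities (assumed for illustration): in practice these would be
# estimated from data, e.g. with a density model over inputs.
train_dist = multivariate_normal(mean=[0.0, 0.0])  # training input density p_train(x)
test_dist = multivariate_normal(mean=[1.5, 0.0])   # shifted test-time density p_test(x)

def selection_function(x, threshold=1.0):
    """Likelihood-ratio selection rule: predict on x only if the ratio
    p_train(x) / p_test(x) exceeds the threshold; otherwise abstain."""
    ratio = train_dist.pdf(x) / test_dist.pdf(x)
    return ratio >= threshold

# A point well covered by the training density is accepted;
# a point plausible only under the shifted test density is rejected.
print(selection_function(np.array([0.0, 0.0])))  # True  (predict)
print(selection_function(np.array([3.0, 0.0])))  # False (abstain)
```

Sweeping the threshold trades off coverage (how often the model predicts) against selective risk (error rate on accepted inputs), which is the central trade-off in selective classification.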
Computer Science > Machine Learning
arXiv:2505.15008 (cs)
[Submitted on 21 May 2025 (v1), last revised 14 Feb 2026 (this version, v2)]
Title: Know When to Abstain: Optimal Selective Classification with Likelihood Ratios
Authors: Alvin Heng, Harold Soh
Abstract: Selective classification enhances the reliability of predictive models by allowing them to abstain from making uncertain predictions. In this work, we revisit the design of optimal selection functions through the lens of the Neyman-Pearson lemma, a classical result in statistics that characterizes the optimal rejection rule as a likelihood ratio test. We show that this perspective not only unifies the behavior of several post-hoc selection baselines, but also motivates new approaches to selective classification which we propose here. A central focus of our work is the setting of covariate shift, where the input distribution at test time differs from that at training. This realistic and challenging scenario remains relatively underexplored in the context of selective classification. We evaluate our proposed methods across a range of vision and language tasks, including both supervised learning and vision-language models. Our experiments demonstrate that our Neyman-Pearson-informed methods consistently outperform existing baselines, indicating that ...