[2602.19498] Softmax is not Enough (for Adaptive Conformal Classification)
Summary
The paper critiques the reliance on softmax outputs in adaptive conformal classification, proposing a new method that leverages the pre-softmax logit space to make prediction sets more adaptive and efficient.
Why It Matters
This research addresses a significant limitation in conformal prediction frameworks, which are crucial for uncertainty quantification in machine learning. By improving the reliability of nonconformity scores, the proposed method could lead to more accurate and adaptable models, impacting various applications in AI and machine learning.
Key Takeaways
- Softmax outputs can lead to overconfident misclassifications in classifiers.
- The proposed method uses Helmholtz Free Energy to measure model uncertainty.
- Reweighting nonconformity scores improves prediction set adaptiveness.
- Experiments show significant improvements in efficiency and adaptiveness.
- The approach avoids post-hoc complexity, making it practical for implementation.
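The Helmholtz free energy mentioned above is computed directly from the pre-softmax logits, as E(x) = -T · logsumexp(z/T). The sketch below shows only this energy score, not the paper's reweighting itself; the temperature T = 1 and the exact monotone transform applied to the score are choices the paper makes that are not specified here.

```python
import numpy as np

def energy_score(logits, T=1.0):
    """Helmholtz free energy of a logit vector: E(x) = -T * logsumexp(z / T).
    Lower energy indicates a more confident model; harder inputs get higher energy."""
    z = np.asarray(logits, dtype=float) / T
    m = z.max(axis=-1, keepdims=True)              # stabilize the exponentials
    return -T * (np.squeeze(m, -1) + np.log(np.exp(z - m).sum(axis=-1)))

# A peaked (confident) logit vector has lower energy than a flat (uncertain) one.
confident = energy_score([10.0, 0.0, 0.0])
uncertain = energy_score([1.0, 1.0, 1.0])
```

A monotone transform of this quantity can then be used to scale each sample's nonconformity score so that harder inputs yield larger prediction sets.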
Computer Science > Machine Learning
arXiv:2602.19498 (cs)
[Submitted on 23 Feb 2026]
Title: Softmax is not Enough (for Adaptive Conformal Classification)
Authors: Navid Akhavan Attar, Hesam Asadollahzadeh, Ling Luo, Uwe Aickelin
Abstract: The merit of Conformal Prediction (CP), as a distribution-free framework for uncertainty quantification, depends on generating prediction sets that are efficient, reflected in small average set sizes, while adaptive, meaning they signal uncertainty by varying in size according to input difficulty. A central limitation for deep conformal classifiers is that the nonconformity scores are derived from softmax outputs, which can be unreliable indicators of how certain the model truly is about a given input, sometimes leading to overconfident misclassifications or undue hesitation. In this work, we argue that this unreliability can be inherited by the prediction sets generated by CP, limiting their capacity for adaptiveness. We propose a new approach that leverages information from the pre-softmax logit space, using the Helmholtz Free Energy as a measure of model uncertainty and sample difficulty. By reweighting nonconformity scores with a monotonic transformation of the energy score of each sample, we improve their sensitivity to input difficulty. Our experiments with four state-of-t...
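For context, here is a minimal sketch of standard split conformal classification with the common 1 − softmax-probability nonconformity score — the softmax-based baseline the paper argues is insufficient, not the energy-reweighted method itself. The calibration and test data below are made up for illustration.

```python
import numpy as np

def conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal classification with score s(x, y) = 1 - p_y(x).
    Returns one prediction set per test row with >= 1-alpha marginal coverage."""
    n = len(cal_labels)
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    k = int(np.ceil((n + 1) * (1 - alpha)))        # finite-sample correction
    q = np.sort(scores)[k - 1] if k <= n else np.inf
    return [set(np.flatnonzero(1.0 - p <= q)) for p in test_probs]

# Toy calibration set: 9 examples, all with true class 0, varying confidence.
p_true = np.array([0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.5, 0.3, 0.1])
cal_probs = np.stack([p_true, (1 - p_true) / 2, (1 - p_true) / 2], axis=1)
cal_labels = np.zeros(9, dtype=int)

test_probs = np.array([[0.80, 0.10, 0.10],   # easy input
                       [0.40, 0.35, 0.25]])  # ambiguous input
sets = conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.2)
# Adaptiveness: the ambiguous input receives a larger set than the easy one.
```

The paper's critique is that when the softmax probabilities feeding these scores are miscalibrated, the resulting sets inherit that unreliability; its remedy is to rescale the scores using the logit-space energy before computing the conformal quantile.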