[2602.14934] Activation-Space Uncertainty Quantification for Pretrained Networks
Summary
The paper introduces Gaussian Process Activations (GAPA), a post-hoc method for uncertainty quantification in pretrained networks that adds closed-form epistemic variances without altering the backbone's predictions.
Why It Matters
Reliable uncertainty estimates are essential for deploying AI models safely. GAPA avoids the retraining, Monte Carlo sampling, and second-order computations that traditional methods require, making robust uncertainty quantification easier to add to applications ranging from regression to classification.
Key Takeaways
- GAPA shifts Bayesian modeling from weights to activations for better uncertainty quantification.
- The method preserves original predictions while providing closed-form epistemic variances.
- GAPA is efficient, requiring no sampling or second-order computations, making it suitable for modern architectures.
- It outperforms existing post-hoc methods in calibration and out-of-distribution detection.
- Applicable across various domains including regression, classification, and language modeling.
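The core idea in the takeaways above can be sketched in a toy 1-D form: treat the original activation function as the prior mean of a Gaussian process and condition on cached training pre-activations whose targets are the activation's own values. The residuals then vanish, so the posterior mean equals the original activation exactly, while the posterior variance gives an epistemic signal. This is an illustrative sketch, not the paper's implementation; the kernel, lengthscale, and function names here are assumptions.

```python
import numpy as np

def rbf(a, b, ls=1.0):
    # RBF kernel between two 1-D arrays of pre-activations
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

def gp_activation(z_test, z_train, act=lambda z: np.maximum(z, 0.0),
                  ls=1.0, jitter=1e-6):
    """Toy 1-D 'GP activation' (hypothetical sketch): the GP prior mean is
    the original activation `act`, and the observations are act(z_train),
    so the residuals are identically zero and the posterior mean equals
    act(z_test) by construction; the variance is the GP posterior variance."""
    K = rbf(z_train, z_train, ls) + jitter * np.eye(len(z_train))
    Ks = rbf(z_test, z_train, ls)
    resid = act(z_train) - act(z_train)          # zero by construction
    alpha = np.linalg.solve(K, resid)
    mean = act(z_test) + Ks @ alpha              # == act(z_test) exactly
    # diag(Ks K^{-1} Ks^T) without forming the full matrix
    var = 1.0 - np.einsum('ij,ij->i', Ks, np.linalg.solve(K, Ks.T).T)
    return mean, np.maximum(var, 0.0)
```

A test pre-activation far from every cached training activation gets variance near the prior (≈1 for this kernel), while one inside the cached range gets variance near zero, all in a single deterministic pass.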
Abstract
Subjects: Statistics > Machine Learning (stat.ML)
[Submitted on 16 Feb 2026]
Authors: Richard Bergna, Stefan Depeweg, Sergio Calvo-Ordoñez, Jonathan Plenk, Alvaro Cartea, Jose Miguel Hernández-Lobato
Reliable uncertainty estimates are crucial for deploying pretrained models; yet, many strong methods for quantifying uncertainty require retraining, Monte Carlo sampling, or expensive second-order computations and may alter a frozen backbone's predictions. To address this, we introduce Gaussian Process Activations (GAPA), a post-hoc method that shifts Bayesian modeling from weights to activations. GAPA replaces standard nonlinearities with Gaussian-process activations whose posterior mean exactly matches the original activation, preserving the backbone's point predictions by construction while providing closed-form epistemic variances in activation space. To scale to modern architectures, we use a sparse variational inducing-point approximation over cached training activations, combined with local k-nearest-neighbor subset conditioning, enabling deterministic single-pass uncertainty propagation without sampling, backpropagation, or second-order information. Across regression, classification, image segmentation, and language modeling...
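The local k-nearest-neighbor subset conditioning mentioned in the abstract can also be sketched: instead of conditioning each test pre-activation on all N cached activations (an O(N^3) solve), condition only on its k nearest neighbors, an O(k^3) solve per point. The function below is a hypothetical minimal version under those assumptions, not the paper's implementation.

```python
import numpy as np

def knn_gp_variance(z_test, z_cache, k=16, ls=1.0, jitter=1e-6):
    """Sketch of local k-NN subset conditioning: for each test
    pre-activation, compute the GP posterior variance using only its
    k nearest cached training activations, keeping the per-point cost
    at O(k^3) instead of O(N^3) for the full cache."""
    def rbf(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls**2)

    out = np.empty(len(z_test))
    for i, z in enumerate(z_test):
        # select the k cached activations nearest to this test point
        nn = z_cache[np.argsort(np.abs(z_cache - z))[:k]]
        K = rbf(nn, nn) + jitter * np.eye(k)
        ks = rbf(np.array([z]), nn)              # shape (1, k)
        out[i] = 1.0 - (ks @ np.linalg.solve(K, ks.T)).item()
    return np.maximum(out, 0.0)
```

As with the full-cache version, points near the cached activations get low epistemic variance and points far from them approach the prior variance; the difference is that each variance is computed from a small local subset, which is what makes a single deterministic pass tractable at scale.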