[2602.22925] Beyond NNGP: Large Deviations and Feature Learning in Bayesian Neural Networks
Summary
This paper studies wide Bayesian neural networks, focusing on the rare fluctuations that govern posterior concentration beyond the Gaussian-process (NNGP) limit. It uses large-deviation theory to characterize feature learning directly at the level of predictors and presents numerical experiments validating the theory on finite-width networks.
Why It Matters
Understanding how Bayesian neural networks behave at large but finite width is important for advancing machine learning theory. This research provides insight into how feature learning emerges beyond the fixed-kernel NNGP description and offers a functional-level view of how network predictions concentrate, which could inform the analysis and design of Bayesian models.
Key Takeaways
- Applies large-deviation theory to analyze wide Bayesian neural networks.
- Highlights the importance of rare fluctuations in posterior concentration.
- Derives the posterior rate function from a joint optimization over predictors and internal kernels (see the schematic sketch after this list).
- Validates findings through numerical experiments on finite-width networks.
- Captures non-Gaussian behaviors and data-dependent kernel selection.
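The joint optimization over predictors and internal kernels can be pictured schematically. The display below is an illustrative large-deviation form assuming Gaussian priors and width n; it sketches the general structure of such objectives, not the paper's exact rate function:

    \[
      \mathbb{P}\big(f_n \approx f\big) \asymp e^{-n\, I(f)},
      \qquad
      I(f) \;=\; \inf_{K}\Big[\, I_{\mathrm{ker}}(K) \;+\; \tfrac{1}{2}\, f^{\top} K^{-1} f \,\Big],
    \]

Here I_ker(K) penalizes deviations of the internal kernel K from its infinite-width value, and the quadratic term is the Gaussian cost of producing the output f under kernel K. Fixed-kernel (NNGP) theory corresponds to dropping the infimum and setting K to the NNGP kernel, which leaves a purely Gaussian rate function; optimizing over K is what allows data-dependent kernel selection and posterior deformation.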
Statistics > Machine Learning
arXiv:2602.22925 (stat) [Submitted on 26 Feb 2026]
Title: Beyond NNGP: Large Deviations and Feature Learning in Bayesian Neural Networks
Authors: Katerina Papagiannouli, Dario Trevisan, Giuseppe Pio Zitto
Abstract: We study wide Bayesian neural networks, focusing on the rare but statistically dominant fluctuations that govern posterior concentration beyond Gaussian-process limits. Large-deviation theory provides explicit variational objectives (rate functions) on predictors, yielding an emerging notion of complexity and feature learning directly at the functional level. We show that the posterior output rate function is obtained by a joint optimization over predictors and internal kernels, in contrast with fixed-kernel (NNGP) theory. Numerical experiments demonstrate that the resulting predictions accurately describe finite-width behavior for moderately sized networks, capturing non-Gaussian tails, posterior deformation, and data-dependent kernel selection effects.
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
ACM classes: G.3
Cite as: arXiv:2602.22925 [stat.ML] (arXiv:2602.22925v1 for this version), https://doi.org/10.48550/arXiv.2602.22925
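To make the contrast with fixed-kernel (NNGP) theory concrete, here is a minimal sketch, not the authors' code: it compares the analytic NNGP kernel of a one-hidden-layer ReLU network with the empirical hidden-layer kernel of finite-width networks drawn from a Gaussian prior. The function names, toy data, and widths are illustrative assumptions.

    # Minimal sketch (not the paper's code): contrast the analytic NNGP kernel of a
    # one-hidden-layer ReLU network with the empirical hidden-layer kernel of a
    # finite-width network drawn from the prior. At finite width the empirical
    # kernel fluctuates around the NNGP kernel; such fluctuations are what a
    # large-deviation analysis quantifies.
    import numpy as np

    def nngp_relu_kernel(X, sigma_w=1.0, sigma_b=0.0):
        """Infinite-width (NNGP) kernel for one ReLU hidden layer (arc-cosine kernel)."""
        K0 = sigma_b**2 + sigma_w**2 * (X @ X.T) / X.shape[1]   # pre-activation kernel
        norms = np.sqrt(np.diag(K0))
        cos_theta = np.clip(K0 / np.outer(norms, norms), -1.0, 1.0)
        theta = np.arccos(cos_theta)
        return np.outer(norms, norms) / (2 * np.pi) * (np.sin(theta) + (np.pi - theta) * cos_theta)

    def empirical_kernel(X, width, sigma_w=1.0, seed=None):
        """Hidden-layer kernel of one finite-width ReLU network sampled from the prior."""
        rng = np.random.default_rng(seed)
        W = rng.normal(0.0, sigma_w / np.sqrt(X.shape[1]), size=(X.shape[1], width))
        H = np.maximum(X @ W, 0.0)                               # ReLU features
        return H @ H.T / width

    X = np.random.default_rng(0).normal(size=(5, 3))             # toy inputs: 5 points in 3-d
    K_inf = nngp_relu_kernel(X)
    for width in (10, 100, 10_000):
        K_n = empirical_kernel(X, width, seed=width)
        print(f"width={width:>6}  ||K_n - K_NNGP||_F = {np.linalg.norm(K_n - K_inf):.4f}")

As the width grows, the empirical kernel approaches the fixed NNGP kernel; the paper's large-deviation analysis characterizes the exponentially rare, data-selected kernel fluctuations that matter at large but finite width.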