[2602.19859] Dirichlet Scale Mixture Priors for Bayesian Neural Networks
Summary
This article introduces Dirichlet Scale Mixture (DSM) priors for Bayesian Neural Networks, addressing limitations in interpretability and robustness while enhancing predictive performance and feature selection.
Why It Matters
The research is significant because it proposes a novel approach to improving Bayesian Neural Networks, which combine the flexibility of neural networks with principled uncertainty quantification. By introducing DSM priors, the study aims to enhance model interpretability, reduce overconfidence in predictions, and increase resilience to adversarial attacks, all of which are critical challenges in the field.
Key Takeaways
- Dirichlet Scale Mixture priors enhance interpretability and robustness in Bayesian Neural Networks.
- The proposed priors encourage sparsity and implicit feature selection, improving model efficiency.
- Experiments show DSM priors perform competitively with fewer parameters, especially in small data scenarios.
- Heavy-tailed shrinkage mechanisms mitigate the cold posterior effect, offering a robust alternative to Gaussian priors.
- The research highlights the importance of prior specification in Bayesian modeling.
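To make the shrinkage idea above concrete, here is a minimal sketch of how a Dirichlet scale mixture prior can be sampled. It assumes the common scale-mixture construction (weights conditionally Gaussian, with per-weight variances allocated by a symmetric Dirichlet); the function and parameter names (`sample_dsm_weights`, `alpha`, `global_scale`) are illustrative, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_dsm_weights(n_weights, alpha=0.5, global_scale=1.0, rng=rng):
    """Draw one sample of weights under a schematic DSM-style prior."""
    # Local scales from a symmetric Dirichlet: they sum to 1, so the total
    # prior variance budget is shared across weights. Small alpha pushes
    # most of the budget onto a few weights (sparsity-inducing shrinkage).
    phi = rng.dirichlet(np.full(n_weights, alpha))
    # Conditionally Gaussian weights; the n_weights factor keeps the
    # average per-weight variance comparable to a standard normal.
    return rng.normal(0.0, global_scale * np.sqrt(n_weights * phi))

# With a small concentration parameter, most sampled weights sit near zero
# while a handful carry large scales, mimicking implicit feature selection.
w = sample_dsm_weights(1000, alpha=0.1)
```

Because the Dirichlet scales sum to one, the prior enforces a fixed variance budget: shrinking most weights necessarily frees scale for the few that matter, which is the structured, heavy-tailed behavior the takeaways describe.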
Paper Details
Statistics > Machine Learning, arXiv:2602.19859 (stat). Submitted on 23 Feb 2026.
Title: Dirichlet Scale Mixture Priors for Bayesian Neural Networks
Authors: August Arnstad, Leiv Rønneberg, Geir Storvik
Abstract: Neural networks are the cornerstone of modern machine learning, yet they can be difficult to interpret, give overconfident predictions, and are vulnerable to adversarial attacks. Bayesian neural networks (BNNs) alleviate some of these limitations, but have problems of their own. The key step of specifying prior distributions in BNNs is no trivial task, yet it is often skipped out of convenience. In this work, we propose a new class of prior distributions for BNNs, the Dirichlet scale mixture (DSM) prior, that addresses current limitations in Bayesian neural networks through structured, sparsity-inducing shrinkage. Theoretically, we derive general dependence structures and shrinkage results for DSM priors and show how they manifest under the geometry induced by neural networks. In experiments on simulated and real-world data we find that the DSM priors encourage sparse networks through implicit feature selection, show robustness under adversarial attacks, and deliver competitive predictive performance with substantially fewer effective parameters. In particular, their advantages appear most pr...