[2603.25024] Improving Infinitely Deep Bayesian Neural Networks with Nesterov's Accelerated Gradient Method
Statistics > Machine Learning
arXiv:2603.25024 (stat)
[Submitted on 26 Mar 2026]

Title: Improving Infinitely Deep Bayesian Neural Networks with Nesterov's Accelerated Gradient Method
Authors: Chenxu Yu, Wenqi Fang

Abstract: As a representative continuous-depth neural network approach, stochastic differential equation (SDE)-based Bayesian neural networks (BNNs) have attracted considerable attention due to their solid theoretical foundations and strong potential for real-world applications. However, their reliance on numerical SDE solvers inevitably incurs a large number of function evaluations (NFEs), resulting in high computational cost and occasional convergence instability. To address these challenges, we propose a Nesterov accelerated gradient (NAG)-enhanced SDE-BNN model. By integrating NAG into the SDE-BNN framework along with an NFE-dependent residual skip connection, our method accelerates convergence and substantially reduces NFEs during both training and testing. Extensive empirical results show that our model consistently outperforms conventional SDE-BNNs across various tasks, including image classification and sequence modeling, achieving lower NFEs and improved predictive accuracy.

Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:2603.25024 [stat.ML]
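The abstract does not spell out the update rule, but for context, here is a minimal sketch of the classic Nesterov accelerated gradient step the method builds on. In continuous time, NAG is known to correspond to a second-order ODE (Su, Boyd, and Candès, 2016), which is what makes a momentum-style acceleration natural in a continuous-depth model. All names below (nag_step, grad_f, lr, momentum) are generic placeholders for illustration, not identifiers from the paper.

import numpy as np

def nag_step(x, v, grad_f, lr=0.1, momentum=0.9):
    """One NAG update: evaluate the gradient at a look-ahead point."""
    lookahead = x + momentum * v              # peek ahead along the momentum direction
    v_new = momentum * v - lr * grad_f(lookahead)
    return x + v_new, v_new

# Usage on a toy quadratic f(x) = ||x||^2 / 2, whose gradient is x itself.
x, v = np.array([5.0, -3.0]), np.zeros(2)
for _ in range(50):
    x, v = nag_step(x, v, grad_f=lambda z: z)
print(x)  # spirals in toward the minimizer at the origin

The look-ahead gradient evaluation is what distinguishes NAG from plain momentum and is the source of its accelerated convergence rate on smooth convex problems.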