[2602.22936] Generalization Bounds of Stochastic Gradient Descent in Homogeneous Neural Networks

arXiv - Machine Learning 3 min read Article

Summary

This paper derives generalization bounds for Stochastic Gradient Descent (SGD) in homogeneous neural networks, showing that a slower stepsize decay of order $\Omega(1/\sqrt{t})$ still yields stability-based guarantees, relaxing the classical $\mathcal{O}(1/t)$ requirement that can impede optimization.

Why It Matters

Understanding generalization bounds is crucial for improving the performance of machine learning models. This research provides insights into optimizing SGD, which is widely used in training neural networks, thereby potentially enhancing model accuracy and efficiency.

Key Takeaways

  • Algorithmic stability is among the most potent techniques for generalization analysis in machine learning.
  • In the homogeneous-network regime, a slower stepsize decay of order $\Omega(1/\sqrt{t})$ suffices for stability-based bounds in non-convex training, relaxing the classical $\mathcal{O}(1/t)$ requirement.
  • The findings apply broadly, since homogeneous networks include fully-connected and convolutional networks with ReLU and LeakyReLU activations.

Computer Science > Machine Learning

arXiv:2602.22936 (cs) [Submitted on 26 Feb 2026]

Title: Generalization Bounds of Stochastic Gradient Descent in Homogeneous Neural Networks

Authors: Wenquan Ma, Yang Sui, Jiaye Teng, Bohan Wang, Jing Xu, Jingqin Yang

Abstract: Algorithmic stability is among the most potent techniques in generalization analysis. However, its derivation usually requires a stepsize $\eta_t = \mathcal{O}(1/t)$ under non-convex training regimes, where $t$ denotes iterations. This rigid decay of the stepsize potentially impedes optimization and may not align with practical scenarios. In this paper, we derive the generalization bounds under the homogeneous neural network regimes, proving that this regime enables slower stepsize decay of order $\Omega(1/\sqrt{t})$ under mild assumptions. We further extend the theoretical results from several aspects, e.g., non-Lipschitz regimes. This finding is broadly applicable, as homogeneous neural networks encompass fully-connected and convolutional neural networks with ReLU and LeakyReLU activations.

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2602.22936 [cs.LG] (or arXiv:2602.22936v1 [cs.LG] for this version)

DOI: https://doi.org/10.48550/arXiv.2602.22936
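The two stepsize schedules the abstract contrasts can be illustrated with a toy SGD run. This is a minimal sketch on a made-up one-dimensional non-convex loss (not the paper's setting or analysis); the function `noisy_grad`, the constants, and the loss itself are all illustrative assumptions:

```python
import math
import random

def noisy_grad(w, rng):
    # Gradient of a toy non-convex loss f(w) = w^4 - 3w^2
    # (minima at w = ±sqrt(3/2)), perturbed with Gaussian noise
    # to mimic mini-batch sampling.
    return 4 * w**3 - 6 * w + rng.gauss(0.0, 0.1)

def sgd(schedule, steps=1000, w0=2.0, seed=0):
    # Plain SGD with a time-dependent stepsize eta_t = schedule(t).
    rng = random.Random(seed)
    w = w0
    for t in range(1, steps + 1):
        w -= schedule(t) * noisy_grad(w, rng)
    return w

# Classical stability analyses require eta_t = O(1/t); the paper
# argues that Omega(1/sqrt(t)) decay suffices for homogeneous networks.
fast_decay = lambda t: 0.05 / t              # eta_t = O(1/t)
slow_decay = lambda t: 0.05 / math.sqrt(t)   # eta_t = Omega(1/sqrt(t))

print("O(1/t):      ", sgd(fast_decay))
print("O(1/sqrt(t)):", sgd(slow_decay))
```

The point of the sketch is only the schedule shapes: the $1/\sqrt{t}$ schedule keeps the stepsize substantially larger at late iterations (e.g. at $t=1000$, $0.05/\sqrt{t} \approx 0.0016$ versus $0.05/t = 0.00005$), which is why the rigid $1/t$ decay can stall optimization in practice.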
