[2604.00505] Towards Initialization-dependent and Non-vacuous Generalization Bounds for Overparameterized Shallow Neural Networks

Subject: Computer Science > Machine Learning
arXiv:2604.00505 (cs)
Submitted on 1 Apr 2026
Authors: Yunwen Lei, Yufeng Xie

Abstract: Overparameterized neural networks often show a benign overfitting property, in the sense that they generalize well even though the number of parameters exceeds the number of training examples. A promising direction for explaining benign overfitting is to relate generalization to the norm of the distance from initialization, motivated by the empirical observation that this distance is often significantly smaller than the norm itself. However, existing initialization-dependent complexity analyses cannot fully exploit the power of initialization: the associated bounds depend on the spectral norm of the initialization matrix, which can scale as the square root of the width, and are therefore not effective for overparameterized models. In this paper, we develop the first fully initialization-dependent complexity bounds for shallow neural networks with general Lipschitz activation functions, which enjoy a logarithmic dependency on the width. Our bounds depend on the path-norm of the distance from ini...
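The abstract's remark that the spectral norm of the initialization matrix can scale as the square root of the width can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes an i.i.d. standard Gaussian initialization of a width-by-input-dimension weight matrix, and the helper names (`gaussian_matrix`, `spectral_norm`, `spectral_norms_by_width`), the input dimension, and the seed are all illustrative choices. The spectral norm is estimated with plain power iteration so that only the standard library is needed.

```python
import math
import random


def gaussian_matrix(rows, cols, rng):
    """Hypothetical initialization: i.i.d. N(0, 1) entries (an assumption)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def spectral_norm(mat, iters=100):
    """Estimate the largest singular value by power iteration on mat^T mat."""
    rows, cols = len(mat), len(mat[0])
    v = [1.0 / math.sqrt(cols)] * cols  # unit-norm starting vector
    sigma = 0.0
    for _ in range(iters):
        # u = mat @ v
        u = [sum(row[j] * v[j] for j in range(cols)) for row in mat]
        # w = mat^T @ u = (mat^T mat) @ v
        w = [sum(mat[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_w = math.sqrt(sum(x * x for x in w))
        if norm_w == 0.0:
            return 0.0
        # For unit v, ||mat^T mat v|| approaches the squared top singular value.
        sigma = math.sqrt(norm_w)
        v = [x / norm_w for x in w]
    return sigma


def spectral_norms_by_width(widths, input_dim=8, seed=0):
    """Map each width m to the estimated spectral norm of an m x input_dim matrix."""
    rng = random.Random(seed)
    return {m: spectral_norm(gaussian_matrix(m, input_dim, rng)) for m in widths}


if __name__ == "__main__":
    for m, s in spectral_norms_by_width([64, 256, 1024]).items():
        # The ratio to sqrt(m) should stay of order one as the width grows.
        print(f"width={m:5d}  spectral norm={s:7.2f}  ratio to sqrt(width)={s / math.sqrt(m):.2f}")
```

Under this assumed initialization, the ratio of the spectral norm to the square root of the width stays roughly constant as the width grows, which is why a bound carrying that norm as a factor degrades for very wide networks.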

Originally published on April 02, 2026. Curated by AI News.
