[2509.21895] Why High-rank Neural Networks Generalize?: An Algebraic Framework with RKHSs
Summary
This paper develops an algebraic framework to explain why high-rank neural networks generalize effectively, deriving a new Rademacher complexity bound via Koopman operators, group representations, and reproducing kernel Hilbert spaces (RKHSs).
Why It Matters
Understanding the generalization capabilities of high-rank neural networks is crucial for improving model performance in machine learning. This research provides a broader theoretical foundation, potentially impacting various applications in AI and machine learning, particularly in developing more robust models.
Key Takeaways
- Introduces a new Rademacher complexity bound for deep neural networks.
- Utilizes Koopman operators and RKHSs to derive generalization insights.
- Expands existing theoretical frameworks to encompass a wider range of models.
- Highlights the significance of high-rank weight matrices in model performance.
- Paves the way for future research in Koopman-based theories.
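To make the central quantity concrete: the Rademacher complexity bounded in the paper measures how well a function class can correlate with random sign noise on a sample. The sketch below is a generic Monte Carlo estimate of the empirical Rademacher complexity for a small finite function class; it illustrates the definition only and is not the paper's Koopman-operator method (the function `empirical_rademacher` and its arguments are hypothetical names for this illustration).

```python
import numpy as np

def empirical_rademacher(models, X, n_draws=1000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity
    R_n = E_sigma [ sup_f (1/n) * sum_i sigma_i * f(x_i) ]
    for a finite class of functions evaluated on a fixed sample X."""
    rng = np.random.default_rng(seed)
    n = len(X)
    # Precompute each model's outputs on the sample: shape (num_models, n).
    outputs = np.array([f(X) for f in models])
    total = 0.0
    for _ in range(n_draws):
        # Draw i.i.d. Rademacher signs (+1/-1 with equal probability).
        sigma = rng.choice([-1.0, 1.0], size=n)
        # Supremum over the class of the signed empirical average.
        total += np.max(outputs @ sigma) / n
    return total / n_draws

# Tiny symmetric class {x -> x, x -> -x} on a one-point sample:
# the supremum equals 1 for every sign draw, so the estimate is 1.
print(empirical_rademacher([lambda X: X, lambda X: -X], np.array([1.0])))
```

A richer class (e.g. neural networks with varying weight ranks) would be plugged in the same way; the paper's contribution is an analytic upper bound on this quantity rather than a sampling estimate.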
Computer Science > Machine Learning
arXiv:2509.21895 (cs)
[Submitted on 26 Sep 2025 (v1), last revised 24 Feb 2026 (this version, v2)]
Title: Why High-rank Neural Networks Generalize?: An Algebraic Framework with RKHSs
Authors: Yuka Hashimoto, Sho Sonoda, Isao Ishikawa, Masahiro Ikeda
Abstract: We derive a new Rademacher complexity bound for deep neural networks using Koopman operators, group representations, and reproducing kernel Hilbert spaces (RKHSs). The proposed bound describes why the models with high-rank weight matrices generalize well. Although there are existing bounds that attempt to describe this phenomenon, these existing bounds can be applied to limited types of models. We introduce an algebraic representation of neural networks and a kernel function to construct an RKHS to derive a bound for a wider range of realistic models. This work paves the way for the Koopman-based theory for Rademacher complexity bounds to be valid for more practical situations.
Subjects: Machine Learning (cs.LG); Functional Analysis (math.FA); Representation Theory (math.RT); Machine Learning (stat.ML)
Cite as: arXiv:2509.21895 [cs.LG] (or arXiv:2509.21895v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2509.21895
Journal reference: ICLR 2026 Submis...