[2507.00629] Generalization performance of narrow one-hidden layer networks in the teacher-student setting
Condensed Matter > Disordered Systems and Neural Networks
arXiv:2507.00629 (cond-mat)
[Submitted on 1 Jul 2025 (v1), last revised 25 Mar 2026 (this version, v4)]

Title: Generalization performance of narrow one-hidden layer networks in the teacher-student setting
Authors: Rodrigo Pérez Ortiz, Gibbs Nwemadji, Jean Barbier, Federica Gerace, Alessandro Ingrosso, Clarissa Lauditi, Enrico M. Malatesta

Abstract: Understanding the generalization properties of neural networks on simple input-output distributions is key to explaining their performance on real datasets. The classical teacher-student setting, where a network is trained on data generated by a teacher model, provides a canonical theoretical test bed. In this context, a complete theoretical characterization of fully connected one-hidden-layer networks with generic activation functions remains missing. In this work, we develop a general framework for such networks with large width, yet much smaller than the input dimension. Using methods from statistical physics, we derive closed-form expressions for the typical performance of both finite-temperature (Bayesian) and empirical risk minimization estimators in terms of a small number of order parameters. We uncover a transition to a specialization phase, where hidden neurons align w...
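The setting described in the abstract can be made concrete with a small simulation. Below is a minimal numerical sketch of the teacher-student setup for a narrow one-hidden-layer network, with the student trained by empirical risk minimization (full-batch gradient descent on the squared loss) and its generalization error measured on fresh teacher-generated data. The specific choices here (tanh activation, Gaussian i.i.d. inputs, learning rate, sample sizes) are illustrative assumptions and are not taken from the paper, which treats generic activations and also the finite-temperature Bayesian estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions: hidden width K is "narrow", i.e. much smaller than the
# input dimension N, the regime considered in the paper.
N, K = 500, 4
P_train, P_test = 2000, 5000


def forward(W, v, X):
    """One-hidden-layer network  f(x) = v . g(W x / sqrt(N)) / sqrt(K)."""
    h = np.tanh(X @ W.T / np.sqrt(N))        # (samples, K) hidden activations
    return h @ v / np.sqrt(K), h


# Teacher network: fixed random weights generate the labels.
W_star = rng.standard_normal((K, N))
v_star = rng.standard_normal(K)

X_train = rng.standard_normal((P_train, N))
X_test = rng.standard_normal((P_test, N))
y_train, _ = forward(W_star, v_star, X_train)
y_test, _ = forward(W_star, v_star, X_test)

# Student network with the same architecture, trained by empirical risk
# minimization (plain gradient descent on the squared loss).
W = rng.standard_normal((K, N))
v = rng.standard_normal(K)
lr = 0.5

for step in range(5000):
    out, h = forward(W, v, X_train)
    err = out - y_train                                    # (P_train,)
    grad_v = h.T @ err / (P_train * np.sqrt(K))
    grad_pre = (err[:, None] * v / np.sqrt(K)) * (1.0 - h ** 2)
    grad_W = grad_pre.T @ X_train / (P_train * np.sqrt(N))
    v -= lr * grad_v
    W -= lr * grad_W

# Generalization error: mean squared error on fresh samples from the teacher.
pred_test, _ = forward(W, v, X_test)
train_loss = 0.5 * np.mean((forward(W, v, X_train)[0] - y_train) ** 2)
gen_error = 0.5 * np.mean((pred_test - y_test) ** 2)
print(f"training loss       : {train_loss:.4f}")
print(f"generalization error: {gen_error:.4f}")
```

In this kind of experiment, tracking the overlaps between student and teacher hidden weights as the number of samples grows is what reveals the specialization transition the abstract refers to; the paper characterizes this analytically through a small set of order parameters rather than by simulation.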