[2507.00629] Generalization performance of narrow one-hidden layer networks in the teacher-student setting
Condensed Matter > Disordered Systems and Neural Networks
arXiv:2507.00629 (cond-mat)
[Submitted on 1 Jul 2025 (v1), last revised 25 Mar 2026 (this version, v4)]

Title: Generalization performance of narrow one-hidden layer networks in the teacher-student setting
Authors: Rodrigo Pérez Ortiz, Gibbs Nwemadji, Jean Barbier, Federica Gerace, Alessandro Ingrosso, Clarissa Lauditi, Enrico M. Malatesta

Abstract: Understanding the generalization properties of neural networks on simple input-output distributions is key to explaining their performance on real datasets. The classical teacher-student setting, where a network is trained on data generated by a teacher model, provides a canonical theoretical test bed. In this context, a complete theoretical characterization of fully connected one-hidden-layer networks with generic activation functions remains missing. In this work, we develop a general framework for such networks with large width, yet much smaller than the input dimension. Using methods from statistical physics, we derive closed-form expressions for the typical performance of both finite-temperature (Bayesian) and empirical risk minimization estimators in terms of a small number of order parameters. We uncover a transition to a specialization phase, where hidden neurons align w...
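The setting described in the abstract can be made concrete with a small simulation. Below is a minimal numerical sketch of the teacher-student setup for a narrow one-hidden-layer network, with the student trained by empirical risk minimization (full-batch gradient descent on the squared loss) and its generalization error measured on fresh teacher-generated data. The specific choices here (tanh activation, Gaussian i.i.d. inputs, learning rate, sample sizes) are illustrative assumptions and are not taken from the paper, which treats generic activations and also the finite-temperature Bayesian estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions: hidden width K is "narrow", i.e. much smaller than the
# input dimension N, the regime considered in the paper.
N, K = 500, 4
P_train, P_test = 2000, 5000


def forward(W, v, X):
    """One-hidden-layer network  f(x) = v . g(W x / sqrt(N)) / sqrt(K)."""
    h = np.tanh(X @ W.T / np.sqrt(N))        # (samples, K) hidden activations
    return h @ v / np.sqrt(K), h


# Teacher network: fixed random weights generate the labels.
W_star = rng.standard_normal((K, N))
v_star = rng.standard_normal(K)

X_train = rng.standard_normal((P_train, N))
X_test = rng.standard_normal((P_test, N))
y_train, _ = forward(W_star, v_star, X_train)
y_test, _ = forward(W_star, v_star, X_test)

# Student network with the same architecture, trained by empirical risk
# minimization (plain gradient descent on the squared loss).
W = rng.standard_normal((K, N))
v = rng.standard_normal(K)
lr = 0.5

for step in range(5000):
    out, h = forward(W, v, X_train)
    err = out - y_train                                    # (P_train,)
    grad_v = h.T @ err / (P_train * np.sqrt(K))
    grad_pre = (err[:, None] * v / np.sqrt(K)) * (1.0 - h ** 2)
    grad_W = grad_pre.T @ X_train / (P_train * np.sqrt(N))
    v -= lr * grad_v
    W -= lr * grad_W

# Generalization error: mean squared error on fresh samples from the teacher.
pred_test, _ = forward(W, v, X_test)
train_loss = 0.5 * np.mean((forward(W, v, X_train)[0] - y_train) ** 2)
gen_error = 0.5 * np.mean((pred_test - y_test) ** 2)
print(f"training loss       : {train_loss:.4f}")
print(f"generalization error: {gen_error:.4f}")
```

In this kind of experiment, tracking the overlaps between student and teacher hidden weights as the number of samples grows is what reveals the specialization transition the abstract refers to; the paper characterizes this analytically through a small set of order parameters rather than by simulation.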