[2603.21502] Quotient Geometry, Effective Curvature, and Implicit Bias in Simple Shallow Neural Networks

Computer Science > Machine Learning
arXiv:2603.21502 (cs)
[Submitted on 23 Mar 2026]

Title: Quotient Geometry, Effective Curvature, and Implicit Bias in Simple Shallow Neural Networks

Authors: Hang-Cheng Dong, Pengcheng Cheng

Abstract: Overparameterized shallow neural networks admit substantial parameter redundancy: distinct parameter vectors may represent the same predictor due to hidden-unit permutations, rescalings, and related symmetries. As a result, geometric quantities computed directly in the ambient Euclidean parameter space can reflect artifacts of representation rather than intrinsic properties of the predictor.

In this paper, we develop a differential-geometric framework for analyzing simple shallow networks through the quotient space obtained by modding out parameter symmetries on a regular set. We first characterize the symmetry and quotient structure of regular shallow-network parameters and show that the finite-sample realization map induces a natural metric on the quotient manifold. This leads to an effective notion of curvature that removes degeneracy along symmetry orbits and yields a symmetry-reduced Hessian capturing intrinsic local geometry.

We then study gradient flows on the quotient and show that only the horizontal component of parameter motion contributes to fi...
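The redundancy described in the abstract can be checked numerically. The following is a minimal sketch, assuming a bias-free two-layer ReLU network f(x) = sum_j a_j relu(w_j . x); the abstract does not state the paper's exact architecture, so this choice, the helper names `predict` and `jacobian`, and the problem sizes are illustrative assumptions. It checks that rescaling and permuting hidden units leave the predictions unchanged, and that the Jacobian of the map from parameters to predictions on the sample vanishes along a rescaling orbit. That vanishing is the kind of degeneracy a quotient construction is meant to remove.

```python
import numpy as np

def predict(W, a, X):
    """Assumed model: f(x) = sum_j a_j * relu(w_j . x), evaluated on the rows of X."""
    return np.maximum(X @ W.T, 0.0) @ a

def jacobian(W, a, X):
    """Jacobian of the predictions on X with respect to (W flattened row-major, a)."""
    pre = X @ W.T                                  # pre-activations, shape (n, m)
    act = np.maximum(pre, 0.0)                     # d f_i / d a_j
    gate = (pre > 0).astype(float)                 # ReLU derivative
    dW = (gate * a)[:, :, None] * X[:, None, :]    # d f_i / d W_jk, shape (n, m, d)
    return np.concatenate([dW.reshape(len(X), -1), act], axis=1)

rng = np.random.default_rng(0)
n, d, m = 8, 3, 5                                  # illustrative sizes
X = rng.normal(size=(n, d))
W = rng.normal(size=(m, d))                        # one row per hidden unit
a = rng.normal(size=m)
base = predict(W, a, X)

# Rescaling symmetry: (w_j, a_j) -> (c_j * w_j, a_j / c_j) with c_j > 0.
c = rng.uniform(0.5, 2.0, size=m)
rescaling_gap = np.max(np.abs(predict(W * c[:, None], a / c, X) - base))

# Permutation symmetry: relabel the hidden units.
perm = rng.permutation(m)
permutation_gap = np.max(np.abs(predict(W[perm], a[perm], X) - base))

# Tangent to the rescaling orbit of unit j at c_j = 1: dW_j = w_j, da_j = -a_j.
j = 0
v_W = np.zeros_like(W)
v_W[j] = W[j]
v_a = np.zeros_like(a)
v_a[j] = -a[j]
v = np.concatenate([v_W.ravel(), v_a])
orbit_gap = np.max(np.abs(jacobian(W, a, X) @ v))

print(rescaling_gap, permutation_gap, orbit_gap)
```

All three printed values should be zero up to floating-point rounding. Because the Jacobian annihilates the orbit direction, the pulled-back form J^T J computed in the ambient parameter space is singular there, which is consistent with the abstract's point that ambient geometric quantities can reflect the representation rather than the predictor.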

Originally published on March 24, 2026. Curated by AI News.
