[2604.04655] Grokking as Dimensional Phase Transition in Neural Networks


About this article

Abstract page for arXiv paper 2604.04655: Grokking as Dimensional Phase Transition in Neural Networks

Computer Science > Machine Learning
arXiv:2604.04655 (cs) [Submitted on 6 Apr 2026]

Title: Grokking as Dimensional Phase Transition in Neural Networks
Authors: Ping Wang

Abstract: Neural network grokking -- the abrupt memorization-to-generalization transition -- challenges our understanding of learning dynamics. Through finite-size scaling of gradient avalanche dynamics across eight model scales, we find that grokking is a dimensional phase transition: the effective dimensionality D crosses from sub-diffusive (subcritical, D < 1) to super-diffusive (supercritical, D > 1) at the onset of generalization, exhibiting self-organized criticality (SOC). Crucially, D reflects gradient field geometry, not network architecture: synthetic i.i.d. Gaussian gradients maintain D ≈ 1 regardless of graph topology, while real training exhibits a dimensional excess arising from backpropagation correlations. The grokking-localized D(t) crossing -- robust across topologies -- offers new insight into the trainability of overparameterized networks.

Subjects: Machine Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Artificial Intelligence (cs.AI); Adaptation and Self-Organizing Systems (nlin.AO)
Cite as: arXiv:2604.04655 [cs.LG] (or arXiv:2604.04655v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2604.04655
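The abstract's sub-/super-diffusive language corresponds to the standard scaling exponent of a trajectory's mean-squared displacement, MSD(lag) ~ lag^α, with α = 1 for ordinary diffusion. As a minimal sketch (not the paper's actual estimator; the function name and the toy setup are illustrative assumptions), one can check the claim that i.i.d. Gaussian "gradients" yield an exponent close to 1:

```python
import numpy as np

def diffusion_exponent(increments):
    """Estimate alpha in MSD(lag) ~ lag**alpha for a trajectory built
    by cumulatively summing the given increment vectors."""
    traj = np.cumsum(increments, axis=0)          # shape (T, n_params)
    T = traj.shape[0]
    # Log-spaced lags, up to half the trajectory length
    lags = np.unique(np.logspace(0, np.log10(T // 2), 20).astype(int))
    # Average squared displacement over all start times, per lag
    msd = np.array([
        np.mean(np.sum((traj[lag:] - traj[:-lag]) ** 2, axis=1))
        for lag in lags
    ])
    # Slope of the log-log fit is the scaling exponent
    alpha, _ = np.polyfit(np.log(lags), np.log(msd), 1)
    return alpha

rng = np.random.default_rng(0)
steps = rng.normal(size=(20_000, 8))  # synthetic i.i.d. Gaussian "gradients"
alpha = diffusion_exponent(steps)     # expected to be close to 1
```

Correlated increments (as the paper attributes to backpropagation) would push this exponent away from 1, which is the dimensional excess the abstract describes.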

Originally published on April 07, 2026. Curated by AI News.

