[2604.04655] Grokking as Dimensional Phase Transition in Neural Networks
Computer Science > Machine Learning

arXiv:2604.04655 (cs) [Submitted on 6 Apr 2026]

Title: Grokking as Dimensional Phase Transition in Neural Networks
Authors: Ping Wang

Abstract: Neural network grokking -- the abrupt memorization-to-generalization transition -- challenges our understanding of learning dynamics. Through finite-size scaling of gradient avalanche dynamics across eight model scales, we find that grokking is a \textit{dimensional phase transition}: effective dimensionality $D$ crosses from sub-diffusive (subcritical, $D < 1$) to super-diffusive (supercritical, $D > 1$) at generalization onset, exhibiting self-organized criticality (SOC). Crucially, $D$ reflects \textbf{gradient field geometry}, not network architecture: synthetic i.i.d.\ Gaussian gradients maintain $D \approx 1$ regardless of graph topology, while real training exhibits dimensional excess from backpropagation correlations. The grokking-localized $D(t)$ crossing -- robust across topologies -- offers new insight into the trainability of overparameterized networks.

Subjects: Machine Learning (cs.LG); Disordered Systems and Neural Networks (cond-mat.dis-nn); Artificial Intelligence (cs.AI); Adaptation and Self-Organizing Systems (nlin.AO)
Cite as: arXiv:2604.04655 [cs.LG] (or arXiv:2604.04655v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2604.04655
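The abstract's sub-/super-diffusive distinction can be made concrete with a standard anomalous-diffusion estimator: fit the exponent $\alpha$ in $\mathrm{MSD}(\tau) \sim \tau^{\alpha}$ from a cumulative trajectory, where uncorrelated increments give $\alpha \approx 1$ and positively correlated increments give $\alpha > 1$ over the correlated lag range. This is a generic sketch of that contrast, not the paper's actual estimator; the moving-average correlation model and all parameter choices below are assumptions for illustration.

```python
import numpy as np

def msd_exponent(steps):
    """Fit alpha in MSD(lag) ~ lag**alpha for the cumulative walk of `steps`.

    This log-log regression over logarithmically spaced lags is a generic
    diffusion-exponent estimator, analogous in spirit to the abstract's
    sub-diffusive (D < 1) vs super-diffusive (D > 1) distinction.
    """
    x = np.cumsum(steps)
    lags = np.unique(np.logspace(0, np.log10(len(x) // 4), 20).astype(int))
    msd = np.array([np.mean((x[l:] - x[:-l]) ** 2) for l in lags])
    slope, _ = np.polyfit(np.log(lags), np.log(msd), 1)
    return slope

rng = np.random.default_rng(0)
n = 20_000
# i.i.d. Gaussian increments: ordinary diffusion, exponent near 1
a_iid = msd_exponent(rng.normal(size=n))
# positively correlated increments (moving average): super-diffusive
# behaviour over the fitted lag range, exponent above 1
a_corr = msd_exponent(np.convolve(rng.normal(size=n), np.ones(50) / 50,
                                  mode="same"))
print(f"iid: {a_iid:.2f}  correlated: {a_corr:.2f}")
```

Consistent with the abstract's claim about synthetic i.i.d. Gaussian gradients, the uncorrelated walk stays near exponent 1, while the correlated walk (a stand-in for backpropagation-induced correlations) exceeds it.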