
[2602.15593] A unified theory of feature learning in RNNs and DNNs

arXiv - Machine Learning February 18, 2026 4 min read Article

Summary

This paper presents a unified theory of feature learning in recurrent neural networks (RNNs) and deep neural networks (DNNs), highlighting their structural similarities and distinct functional properties through a mean-field theory approach.

Why It Matters

Understanding the relationship between RNNs and DNNs is crucial for advancing machine learning techniques. This theory provides insights into how weight sharing in RNNs influences their performance on sequential tasks, potentially guiding future research and applications in AI.

Key Takeaways

RNNs differ from DNNs only by weight sharing, as unrolling in time shows, yet the two exhibit distinct functional properties.
A unified mean-field theory connects architectural structure to functional biases in neural networks.
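In DNN-typical tasks, a phase transition occurs once the learning signal overcomes the noise from the randomness in the weights: below this threshold RNNs and DNNs behave identically; above it, only RNNs develop correlated representations across time steps.
In sequential tasks, weight sharing gives RNNs an inductive bias that aids generalization by interpolating unsupervised time steps.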
The findings may influence future designs and training methods for neural networks.
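
The first takeaway, that an RNN unrolled in time is a DNN whose layers share one weight matrix, can be illustrated with a minimal sketch. This is not the paper's model or code: the tanh nonlinearity, the absence of biases and per-step inputs, the weight scale, and all function names here are assumptions chosen only to keep the example short.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def layer(weights, hidden):
    """One nonlinear layer, h_next = tanh(W h). The tanh is an assumption."""
    return [math.tanh(a) for a in matvec(weights, hidden)]


def dnn_forward(layer_weights, h0):
    """Deep network: every layer has its own weight matrix."""
    hidden = h0
    for weights in layer_weights:
        hidden = layer(weights, hidden)
    return hidden


def rnn_forward(shared_weights, h0, steps):
    """Recurrent network: the same weight matrix is reused at every time step."""
    hidden = h0
    for _ in range(steps):
        hidden = layer(shared_weights, hidden)
    return hidden


def random_matrix(n, rng):
    """Gaussian n x n matrix; the 1/sqrt(n) scale is an arbitrary choice."""
    return [[rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]
            for _ in range(n)]


rng = random.Random(0)
n, steps = 4, 3
h0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
W = random_matrix(n, rng)

# RNN run for `steps` time steps.
rnn_out = rnn_forward(W, h0, steps)
# DNN of depth `steps` whose layers are all tied to the same matrix W.
tied_dnn_out = dnn_forward([W] * steps, h0)
# DNN of the same depth with independent weights in each layer.
untied_dnn_out = dnn_forward([random_matrix(n, rng) for _ in range(steps)], h0)

print("RNN equals tied-weight DNN:", rnn_out == tied_dnn_out)
print("RNN equals untied DNN:", rnn_out == untied_dnn_out)
```

The tied-weight DNN performs exactly the same operations as the RNN, so the two outputs coincide; only the untied network, with a separate matrix per layer, is a different function. The sketch covers the forward pass only and says nothing about the trained-network behavior the paper analyzes.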

Computer Science > Machine Learning
arXiv:2602.15593 (cs)
Submitted on 17 Feb 2026

Title: A unified theory of feature learning in RNNs and DNNs
Authors: Jan P. Bauer, Kirsten Fischer, Moritz Helias, Agostina Palmigiano

Abstract: Recurrent and deep neural networks (RNNs/DNNs) are cornerstone architectures in machine learning. Remarkably, RNNs differ from DNNs only by weight sharing, as can be shown through unrolling in time. How does this structural similarity fit with the distinct functional properties these networks exhibit? To address this question, we here develop a unified mean-field theory for RNNs and DNNs in terms of representational kernels, describing fully trained networks in the feature learning ($\mu$P) regime. This theory casts training as Bayesian inference over sequences and patterns, directly revealing the functional implications induced by the RNNs' weight sharing. In DNN-typical tasks, we identify a phase transition when the learning signal overcomes the noise due to randomness in the weights: below this threshold, RNNs and DNNs behave identically; above it, only RNNs develop correlated representations across timesteps. For sequential tasks, the RNNs' weight sharing furthermore induces an inductive bias that aids generalization by interpolating unsupervised time steps. Overall, our theory offers a way to connect architectural s...
