[2405.17573] Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets
arXiv:2405.17573 (stat) — Statistics > Machine Learning
[Submitted on 27 May 2024 (v1), last revised 25 Mar 2026 (this version, v3)]

Title: Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets
Authors: Arthur Jacot, Alexandre Kaiser

Abstract: We study Leaky ResNets, which interpolate between ResNets and fully-connected nets depending on an 'effective depth' hyper-parameter $\tilde{L}$. In the infinite-depth limit, we study 'representation geodesics' $A_{p}$: continuous paths in representation space (similar to NeuralODEs) from input $p=0$ to output $p=1$ that minimize the parameter norm of the network. We give a Lagrangian and Hamiltonian reformulation, which highlights the importance of two terms: a kinetic energy, which favors small layer derivatives $\partial_{p}A_{p}$, and a potential energy, which favors low-dimensional representations, as measured by the 'Cost of Identity'. The balance between these two forces offers an intuitive understanding of feature learning in ResNets. We leverage this intuition to explain the emergence of a bottleneck structure, as observed in previous work: for large $\tilde{L}$ the potential energy dominates and leads to a separation of timescales, where the representation jumps rapidly from the high-dimensional inputs to a low-dimensional represent...
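The interpolation between ResNets and fully-connected nets that the abstract describes can be sketched numerically. The parameterization below — a convex mix between the identity (skip path) and a full layer map, with mixing weight derived from the effective depth $\tilde{L}$ — is an illustrative assumption for intuition, not necessarily the paper's exact definition:

```python
import numpy as np

def leaky_resnet_forward(x, weights, L_tilde):
    """Illustrative 'leaky' residual forward pass (assumed parameterization).

    Each of the L layers applies a convex combination of the identity
    (ResNet-style skip connection) and a full nonlinear layer map
    (fully-connected style). The mixing weight alpha = L_tilde / L plays
    the role of an effective per-layer step: alpha -> 0 keeps the
    representation path close to the input (ResNet-like small increments),
    while alpha = 1 recovers a plain fully-connected net.
    """
    L = len(weights)
    alpha = min(L_tilde / L, 1.0)
    A = x
    for W in weights:
        A = (1 - alpha) * A + alpha * np.maximum(W @ A, 0.0)  # ReLU layer map
    return A

rng = np.random.default_rng(0)
d, L = 8, 16
weights = [rng.normal(scale=1.0 / np.sqrt(d), size=(d, d)) for _ in range(L)]
x = rng.normal(size=d)

# Small effective depth: representation stays near the input path.
out_resnet_like = leaky_resnet_forward(x, weights, L_tilde=0.5)
# Effective depth equal to L: every layer fully replaces the representation.
out_fc_like = leaky_resnet_forward(x, weights, L_tilde=float(L))
```

Under this (assumed) parameterization, $\tilde{L} = 0$ reduces each layer to the identity, so the "Cost of Identity" potential in the abstract can be read as the price of deviating from this free skip path.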