[2604.01279] Sven: Singular Value Descent as a Computationally Efficient Natural Gradient Method
Computer Science > Machine Learning
arXiv:2604.01279 (cs)
[Submitted on 1 Apr 2026]
Title: Sven: Singular Value Descent as a Computationally Efficient Natural Gradient Method
Authors: Samuel Bright-Thonney, Thomas R. Harvey, Andre Lukas, Jesse Thaler
Abstract: We introduce Sven (Singular Value dEsceNt), a new optimization algorithm for neural networks that exploits the natural decomposition of loss functions into a sum over individual data points, rather than reducing the full loss to a single scalar before computing a parameter update. Sven treats each data point's residual as a separate condition to be satisfied simultaneously, using the Moore-Penrose pseudoinverse of the loss Jacobian to find the minimum-norm parameter update that best satisfies all conditions at once. In practice, this pseudoinverse is approximated via a truncated singular value decomposition, retaining only the $k$ most significant directions and incurring a computational overhead of only a factor of $k$ relative to stochastic gradient descent. This is in comparison to traditional natural gradient methods, which scale as the square of the number of parameters. We show that Sven can be understood as a natural gradient method generalized to the over-parametrized regime, recovering natural gradient descent in the under-p...
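The update described in the abstract (a minimum-norm step obtained from a rank-$k$ truncated pseudoinverse of the per-sample Jacobian) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `sven_step`, the learning rate `lr`, the tolerance `tol`, and the toy linear model are all assumptions made for illustration. It also computes a full SVD and then truncates it, which is simple to read but does not reflect the factor-of-$k$ cost quoted in the abstract.

```python
import numpy as np


def sven_step(residuals, jacobian, k, lr=1.0, tol=1e-12):
    """Hypothetical helper: update from a rank-k truncated pseudoinverse.

    residuals: shape (N,), one residual per data point.
    jacobian:  shape (N, P), derivative of each residual w.r.t. each parameter.
    Returns delta of shape (P,), equal to -lr * V_k diag(1/s_k) U_k^T r.
    """
    U, s, Vt = np.linalg.svd(jacobian, full_matrices=False)
    # Never invert singular values that are numerically zero.
    k = min(k, int(np.count_nonzero(s > tol * s.max())))
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    return -lr * (Vt_k.T @ ((U_k.T @ residuals) / s_k))


# Toy over-parametrized problem (assumed): 3 data points, 5 parameters,
# linear model with residual r = X @ theta - y, so the Jacobian is X itself.
X = np.array([[1.0, 0.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 0.0, 1.0, 0.0],
              [1.0, 1.0, 0.0, 0.0, 2.0]])
y = np.array([1.0, -2.0, 0.5])
theta = np.zeros(5)
r = X @ theta - y

delta_full = sven_step(r, X, k=3)   # all 3 directions: full pseudoinverse
delta_top1 = sven_step(r, X, k=1)   # only the leading singular direction

print("residual norm before:      ", np.linalg.norm(r))
print("residual norm after (k=3): ", np.linalg.norm(X @ (theta + delta_full) - y))
print("residual norm after (k=1): ", np.linalg.norm(X @ (theta + delta_top1) - y))
```

Because the toy model is linear and `X` has full row rank, the `k=3` step coincides with the Moore-Penrose solution and satisfies all three conditions in a single update. With `k=1` only the residual component along the leading singular direction is removed. For a nonlinear network the Jacobian changes with the parameters, so a step of this kind would have to be repeated at each iteration.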