[2604.04891] Muon Dynamics as a Spectral Wasserstein Flow

Mathematics > Optimization and Control
arXiv:2604.04891 (math)
[Submitted on 6 Apr 2026]

Title: Muon Dynamics as a Spectral Wasserstein Flow
Authors: Gabriel Peyré

Abstract: Gradient normalization is central in deep-learning optimization because it stabilizes training and reduces sensitivity to scale. For deep architectures, parameters are naturally grouped into matrices or blocks, so spectral normalizations are often more faithful than coordinatewise Euclidean ones; Muon is the main motivating example of this paper. More broadly, we study a family of spectral normalization rules, ranging from ordinary gradient descent to Muon and intermediate Schatten-type schemes, in a mean-field regime where parameters are modeled by probability measures. We introduce a family of Spectral Wasserstein distances indexed by a norm gamma on positive semidefinite matrices. The trace norm recovers the classical quadratic Wasserstein distance, the operator norm recovers the Muon geometry, and intermediate Schatten norms interpolate between them. We develop the static Kantorovich formulation, prove comparison bounds with W2, derive a max-min representation, and obtain a conditional Brenier theorem. For Gaussian marginals, the problem reduces to a constrained optimization on covariance matrices, extending the Bures formula and yielding a closed form for commuting covariances ...
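
To make the "family of spectral normalization rules" concrete, here is a minimal NumPy sketch of one standard way to read it: the steepest-descent direction of a gradient matrix over a Schatten-p unit ball. This is an assumption for illustration, not the paper's construction. The function name `schatten_steepest_direction` and the exponent `p` are hypothetical, and `p` indexes the norm ball of the update here, which need not match the paper's indexing by a norm gamma on positive semidefinite matrices. Under this reading, p = 2 gives the Frobenius-normalized gradient, and p = infinity gives the polar factor U V^T, an idealized, exact-SVD version of the orthogonalized update associated with Muon.

```python
import numpy as np


def schatten_steepest_direction(grad, p, tol=1e-12):
    """Direction D maximizing <grad, D> over the Schatten-p unit ball (p > 1).

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not p > 1:
        raise ValueError("this sketch assumes p > 1 (p = np.inf is allowed)")

    # Thin SVD: grad = U @ diag(s) @ Vt, with s sorted in decreasing order.
    U, s, Vt = np.linalg.svd(grad, full_matrices=False)

    # A (numerically) zero gradient has no preferred direction.
    if s[0] <= tol:
        return np.zeros_like(grad, dtype=float)

    if np.isinf(p):
        # Operator-norm ball: every nonzero singular value is mapped to 1,
        # which yields the polar factor U @ Vt on the range of grad.
        d = (s > tol * s[0]).astype(float)
    else:
        # Conjugate exponent q with 1/p + 1/q = 1. The maximizer has
        # singular values proportional to s**(q - 1).
        q = p / (p - 1.0)
        d = (s / s[0]) ** (q - 1.0)   # rescaled by s[0] for numerical safety
        d /= np.linalg.norm(d, ord=p)  # enforce unit Schatten-p norm

    # Scale the columns of U by d, then recombine with Vt.
    return (U * d) @ Vt
```

A descent step under this assumption would read `W = W - lr * schatten_steepest_direction(G, p)`, with `lr` a hypothetical step size. Values of p between 2 and infinity flatten the singular-value spectrum of the update progressively, which is one way to picture the intermediate Schatten-type schemes.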

Originally published on April 07, 2026. Curated by AI News.
