[2602.14656] An Embarrassingly Simple Way to Optimize Orthogonal Matrices at Scale
Summary
This paper presents POGO, a simple algorithm for optimizing orthogonal matrices efficiently, addressing the poor scalability of existing optimizers in machine learning applications with many orthogonality constraints.
Why It Matters
Orthogonality constraints are ubiquitous in robust and probabilistic machine learning, yet existing optimizers do not scale to problems with hundreds or thousands of them. POGO's improvements can significantly reduce computational cost and wall-clock time, making it a valuable tool for researchers and practitioners tackling large-scale constrained problems.
Key Takeaways
- POGO optimizes orthogonal matrices with minimal computational cost.
- The algorithm maintains orthogonality throughout the optimization process.
- POGO greatly outperforms recent optimizers on challenging benchmarks, handling thousands of orthogonal matrices in minutes where alternatives take hours.
- It reduces the number of required hyperparameters, simplifying the optimization process.
- A PyTorch implementation is publicly available for practical use.
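The paper's exact POGO update is not reproduced in this summary, but the Landing idea it builds on (Ablin et al., 2024) can be sketched in a few lines: descend along a skew-symmetric "relative" gradient while a cheap correction term X(XᵀX − I) continuously pulls the iterate back toward the orthogonal manifold, avoiding any expensive retraction. The step size `eta`, penalty weight `lam`, and the toy least-squares objective below are illustrative assumptions, not the paper's actual choices.

```python
import numpy as np

def landing_step(X, grad, eta=0.1, lam=1.0):
    """One Landing-style update (hypothetical sketch, not the paper's POGO).

    Combines a skew-symmetric relative gradient (movement along the
    orthogonal manifold) with a penalty term X (X^T X - I) that restores
    orthogonality, so no costly retraction/projection is needed.
    """
    n = X.shape[0]
    skew = 0.5 * (grad @ X.T - X @ grad.T)   # skew part of the relative gradient
    penalty = X @ (X.T @ X - np.eye(n))      # pulls X back toward orthogonality
    return X - eta * (skew @ X + lam * penalty)

# Toy problem: find the orthogonal X closest to a random matrix A,
# i.e. minimize f(X) = 0.5 * ||X - A||_F^2 subject to X^T X = I.
rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))
X = np.eye(n)                                # start on the orthogonal manifold
loss = lambda M: 0.5 * np.sum((M - A) ** 2)
loss0 = loss(X)

for _ in range(500):
    X = landing_step(X, X - A)               # gradient of 0.5 * ||X - A||_F^2

ortho_err = np.linalg.norm(X.T @ X - np.eye(n))
```

After the loop, `ortho_err` stays near machine precision even though no step ever projects back onto the manifold, which is the property that makes this family of methods GPU-friendly: each update is just a handful of matrix products.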
Computer Science > Machine Learning — arXiv:2602.14656 (cs)
[Submitted on 16 Feb 2026]
Title: An Embarrassingly Simple Way to Optimize Orthogonal Matrices at Scale
Authors: Adrián Javaloy, Antonio Vergari
Abstract: Orthogonality constraints are ubiquitous in robust and probabilistic machine learning. Unfortunately, current optimizers are computationally expensive and do not scale to problems with hundreds or thousands of constraints. One notable exception is the Landing algorithm (Ablin et al., 2024), which, however, comes at the expense of temporarily relaxing orthogonality. In this work, we revisit and improve on the ideas behind Landing, enabling the inclusion of modern adaptive optimizers while ensuring that orthogonality constraints are effectively met. Remarkably, these improvements come at little to no cost and reduce the number of required hyperparameters. Our algorithm, POGO, is fast and GPU-friendly, consisting of only 5 matrix products, and in practice maintains orthogonality at all times. On several challenging benchmarks, POGO greatly outperforms recent optimizers and can optimize problems with thousands of orthogonal matrices in minutes where alternatives would take hours. As such, POGO sets a milestone toward finally exploiting orthogonality constraints in ML at scale. A PyTorch implementation of POGO is publicly available.