[2602.17525] Variational inference via radial transport
Summary
The paper introduces a new approach to variational inference (VI) that optimizes over the radial profile of the approximating distribution, improving the approximation of high-dimensional targets beyond what traditional Gaussian surrogates can capture.
Why It Matters
This research addresses a key limitation of conventional variational inference: simple surrogates such as Gaussians often mis-estimate the radial profile of the target distribution, resulting in poor coverage. By correcting that profile via radial transport, the method gives practitioners in machine learning and statistics a cheap way to improve the quality of their approximations across a range of applications.
Key Takeaways
- The proposed algorithm, radVI, is a cheap, effective add-on to existing VI schemes such as Gaussian (mean-field) VI and the Laplace approximation.
- It comes with theoretical convergence guarantees, built on recent developments in optimization over the Wasserstein space.
- Optimizing over radial transport maps improves the approximation of high-dimensional distributions whose radial profile is not well captured by a Gaussian.
- The analysis relies on new regularity properties of radial transport maps in the style of Caffarelli (2000).
- Correcting the radial profile can yield better coverage in practical applications.
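To make the idea of a radial transport map concrete, here is a minimal sketch (not the paper's actual algorithm): samples from a base Gaussian VI approximation are pushed through a monotone map acting only on their radius, leaving directions unchanged. The function `radial_map`, its power-law form, and all parameters below are hypothetical placeholders for illustration; radVI would instead optimize over such radial maps.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10  # dimension (illustrative)

# Base Gaussian VI approximation: q0 = N(mu, sigma^2 I)
mu = np.zeros(d)
sigma = 1.0

def radial_map(r, alpha=1.3):
    """Hypothetical monotone map on the radius.

    radVI optimizes over such one-dimensional maps; here we just
    fix a power transform to show the mechanics.
    """
    return r ** alpha

def sample_radial_transport(n):
    """Draw from the base Gaussian, then reshape its radial profile."""
    z = rng.standard_normal((n, d))
    r = np.linalg.norm(z, axis=1, keepdims=True)  # radius of each sample
    # Push the radius through the map while keeping the direction z/r fixed
    return mu + sigma * radial_map(r) * (z / r)

samples = sample_radial_transport(5000)
```

Because the map is monotone in the radius and leaves directions untouched, the transported distribution keeps the spherical symmetry of the base Gaussian while its radial profile (here, heavier tails for `alpha > 1`) is changed, which is exactly the degree of freedom the paper proposes to optimize.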
Computer Science > Machine Learning
arXiv:2602.17525 (cs) [Submitted on 19 Feb 2026]

Title: Variational inference via radial transport
Authors: Luca Ghafourpour, Sinho Chewi, Alessio Figalli, Aram-Alexandre Pooladian

Abstract: In variational inference (VI), the practitioner approximates a high-dimensional distribution $\pi$ with a simple surrogate one, often a (product) Gaussian distribution. However, in many cases of practical interest, Gaussian distributions might not capture the correct radial profile of $\pi$, resulting in poor coverage. In this work, we approach the VI problem from the perspective of optimizing over these radial profiles. Our algorithm radVI is a cheap, effective add-on to many existing VI schemes, such as Gaussian (mean-field) VI and Laplace approximation. We provide theoretical convergence guarantees for our algorithm, owing to recent developments in optimization over the Wasserstein space (the space of probability distributions endowed with the Wasserstein distance) and new regularity properties of radial transport maps in the style of Caffarelli (2000).

Subjects: Machine Learning (cs.LG); Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as: arXiv:2602.17525 [cs.LG] (or arXiv:2602.17525v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.17525