[2602.20403] Wasserstein Distributionally Robust Online Learning
Summary
This paper studies Wasserstein distributionally robust online learning, addressing its convergence and computational challenges through a saddle-point game framework and a tailored algorithm for piecewise concave losses.
Why It Matters
The study deepens our understanding of online learning under distributional uncertainty, which matters for risk-averse applications. By pairing convergence guarantees with a fast solver for the worst-case expectation subproblem, it makes robust online learning substantially more practical for real-world use.
Key Takeaways
- Introduces a framework for online saddle-point stochastic games in distributionally robust learning.
- Proposes an algorithm that significantly speeds up solving worst-case expectation problems.
- Establishes a connection between infinite-dimensional optimization and budget allocation problems.
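The saddle-point view from the first takeaway can be sketched in a few lines. The toy below is a hypothetical illustration, not the paper's framework: a learner runs gradient descent on its decision x while an adversary runs gradient ascent on a per-sample perturbation delta, using a penalized (Lagrangian) surrogate of the robust objective, f(x, delta; z) = (x - z - delta)^2 - lam * delta^2, which is convex in x and, for lam > 1, concave in delta. All names and constants here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
stream = rng.normal(1.0, 0.5, size=2000)  # samples arriving online, mean 1.0
lam = 2.0                                  # penalty on the adversary's transport cost

x, delta = 0.0, 0.0
for t, z in enumerate(stream, start=1):
    eta = 0.5 / np.sqrt(t)                 # diminishing step size
    g = 2.0 * (x - z - delta)              # shared gradient term of the loss
    x = x - eta * g                        # learner: descent on the decision x
    delta = delta + eta * (-g - 2.0 * lam * delta)  # adversary: ascent on delta

# At the equilibrium of this toy game the perturbation vanishes
# and x approaches the sample mean E[z] = 1.
```

The point of the sketch is the game structure: both players update simultaneously from the same stream, and the iterates settle at a saddle point, mirroring the robust Nash equilibrium the paper's framework converges to.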
Computer Science > Machine Learning
arXiv:2602.20403 (cs)
[Submitted on 23 Feb 2026]
Title: Wasserstein Distributionally Robust Online Learning
Authors: Guixian Chen, Salar Fattahi, Soroosh Shafiee
Abstract: We study distributionally robust online learning, where a risk-averse learner updates decisions sequentially to guard against worst-case distributions drawn from a Wasserstein ambiguity set centered at past observations. While this paradigm is well understood in the offline setting through Wasserstein Distributionally Robust Optimization (DRO), its online extension poses significant challenges in both convergence and computation. In this paper, we address these challenges. First, we formulate the problem as an online saddle-point stochastic game between a decision maker and an adversary selecting worst-case distributions, and propose a general framework that converges to a robust Nash equilibrium coinciding with the solution of the corresponding offline Wasserstein DRO problem. Second, we address the main computational bottleneck, which is the repeated solution of worst-case expectation problems. For the important class of piecewise concave loss functions, we propose a tailored algorithm that exploits problem geometry to achieve substantial speedups over state-of-the-art solvers such as Gurobi. The key insight is a novel connection between the w...
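The abstract credits the speedup to a connection between the worst-case expectation problem and budget allocation. A rough, hypothetical illustration of that flavor (the loss, samples, and budget below are made up, and this is not the paper's algorithm): for a concave piecewise-linear loss in one dimension, the adversary's transport budget is best spent greedily on the sample with the largest current marginal slope, and concavity makes the greedy allocation optimal because slopes only decrease as mass moves.

```python
import numpy as np

# Toy concave piecewise-linear loss: slope 2 below the breakpoint, 0.5 above.
BREAK = 1.0
def loss(z):
    return np.minimum(2.0 * z, 0.5 * z + 1.5)   # the two pieces meet at z = 1

def slope(z):
    return np.where(z < BREAK, 2.0, 0.5)

samples = np.array([0.2, 0.6, 1.4])   # empirical support points
rho = 0.5                             # total transport budget (radius times n)
step = 1e-3                           # discretized budget increments

z = samples.copy()
budget = rho
while budget > 1e-9:
    i = np.argmax(slope(z))           # spend where the marginal gain is largest
    move = min(step, budget)
    z[i] += move
    budget -= move

worst_case = loss(z).mean()           # worst-case expectation under the budget
```

Here all of the budget goes to the first sample (its slope stays at 2 throughout), raising the expectation from about 1.27 to 1.6; a greedy, budget-allocation-style pass replaces a generic LP solve, which is the kind of structural shortcut the abstract alludes to.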