[2602.20578] Upper-Linearizability of Online Non-Monotone DR-Submodular Maximization over Down-Closed Convex Sets
Summary
This paper studies online maximization of non-monotone diminishing-returns (DR) submodular functions over down-closed convex sets. Its main contribution is a new structural result, $1/e$-linearizability, which reduces the problem to online linear optimization and yields improved static and dynamic regret bounds.
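For reference (the summary does not define them), the two standard notions in play are: a differentiable $f : [0,1]^n \to \mathbb{R}_{\ge 0}$ is DR-submodular if it exhibits diminishing returns along every coordinate, and the feasible set $\mathcal{K}$ is down-closed if it contains every point dominated by one of its members:

$$\nabla f(x) \ge \nabla f(y) \ \text{ whenever } x \le y \ \text{(elementwise)}, \qquad y \in \mathcal{K},\ 0 \le x \le y \ \Longrightarrow\ x \in \mathcal{K}.$$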
Why It Matters
In this regime, existing projection-free online methods suffer from suboptimal regret and limited feedback guarantees. By strictly improving the state-of-the-art regret bounds across all feedback models, the result matters for online optimization applications in machine learning and sequential decision making.
Key Takeaways
- Introduces a new structural result for online maximization of non-monotone DR-submodular functions.
- Achieves $O(T^{1/2})$ static regret with a single gradient query per round.
- Unlocks adaptive and dynamic regret guarantees across various feedback models.
- Improves state-of-the-art results in the field of online optimization.
- Provides insights into exponential reparametrization and surrogate potentials (see the sketch after this list).
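The paper's algorithmic details are not in this summary, but the takeaways describe a recognizable reduction pattern. The sketch below is a minimal, hypothetical Python rendering of it, assuming an exponential map of the form $x = 1 - e^{-z}$, plain projected online gradient ascent as the online-linear-optimization (OLO) subroutine, and the unit box as the feasible set; handling a general down-closed convex set, and the paper's actual reparametrization, scaling parameter, and surrogate potential, may all differ.

```python
import numpy as np

def olo_step(z, reward_grad, eta=0.1, lo=0.0, hi=1.0):
    """Projected online gradient ascent on a box: a stand-in OLO
    oracle. The reduction works with any OLO algorithm."""
    return np.clip(z + eta * reward_grad, lo, hi)

def run_reduction(grad_oracle, n, T):
    """Hypothetical sketch of a linearization-based reduction:
    one gradient query per round, fed to the OLO subroutine as a
    surrogate linear reward."""
    z = np.zeros(n)                  # OLO iterate in the reparametrized space
    plays = []
    for t in range(T):
        x = 1.0 - np.exp(-z)         # assumed exponential reparametrization
        plays.append(x)
        g = grad_oracle(t, x)        # single gradient query per round
        surrogate = g * np.exp(-z)   # chain rule through the map: dx/dz = e^{-z}
        z = olo_step(z, surrogate)
    return plays

# Toy usage: f_t(x) = <a_t, x> - ||x||^2 is a simple non-monotone
# DR-submodular reward (concave, separable, sign-changing gradient).
rng = np.random.default_rng(0)
a = rng.random((100, 5))
xs = run_reduction(lambda t, x: a[t] - 2.0 * x, n=5, T=100)
```

One design note on the sketch: the box constraint $z \in [0,1]^n$ keeps every played point inside $[0, 1 - 1/e]^n$; whether the paper's scaling parameter plays an analogous role is an assumption here, not a claim about the paper.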
Computer Science > Machine Learning
arXiv:2602.20578 (cs) [Submitted on 24 Feb 2026]
Title: Upper-Linearizability of Online Non-Monotone DR-Submodular Maximization over Down-Closed Convex Sets
Authors: Yiyang Lu, Haresh Jadav, Mohammad Pedramfar, Ranveer Singh, Vaneet Aggarwal
Abstract: We study online maximization of non-monotone Diminishing-Return (DR) submodular functions over down-closed convex sets, a regime where existing projection-free online methods suffer from suboptimal regret and limited feedback guarantees. Our main contribution is a new structural result showing that this class is $1/e$-linearizable under carefully designed exponential reparametrization, scaling parameter, and surrogate potential, enabling a reduction to online linear optimization. As a result, we obtain $O(T^{1/2})$ static regret with a single gradient query per round and unlock adaptive and dynamic regret guarantees, together with improved rates under semi-bandit, bandit, and zeroth-order feedback. Across all feedback models, our bounds strictly improve the state of the art.
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC); Machine Learning (stat.ML)
Cite as: arXiv:2602.20578 [cs.LG] (or arXiv:2602.20578v1 [cs.LG] for this version), https://doi.org/10.48550/arXiv.2602.20578
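Concretely, and in line with the claimed $1/e$-linearizability, the static guarantee in the abstract is most naturally read as a bound on the standard $\alpha$-regret benchmark with $\alpha = 1/e$, i.e., performance measured against $1/e$ times the best fixed feasible point in hindsight:

$$\mathcal{R}_T \;=\; \frac{1}{e}\max_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x) \;-\; \sum_{t=1}^{T} f_t(x_t) \;=\; O\!\left(T^{1/2}\right).$$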