[2603.25029] Optimal High-Probability Regret for Online Convex Optimization with Two-Point Bandit Feedback

arXiv - Machine Learning March 27, 2026 3 min read

About this article

Abstract page for arXiv paper 2603.25029: Optimal High-Probability Regret for Online Convex Optimization with Two-Point Bandit Feedback

arXiv:2603.25029 (cs)

[Submitted on 26 Mar 2026]

Title: Optimal High-Probability Regret for Online Convex Optimization with Two-Point Bandit Feedback

Authors: Haishan Ye

Abstract: We consider the problem of Online Convex Optimization (OCO) with two-point bandit feedback in an adversarial environment. In this setting, a player attempts to minimize a sequence of adversarially generated convex loss functions while observing the value of each function at only two points. Although it is well known that two-point feedback allows for gradient estimation, achieving tight high-probability regret bounds for strongly convex functions has remained open, as highlighted by Agarwal et al. (2010). The primary challenge lies in the heavy-tailed nature of bandit gradient estimators, which makes standard concentration analysis difficult. In this paper, we resolve this open challenge by providing the first high-probability regret bound of $O(d(\log T + \log(1/\delta))/\mu)$ for $\mu$-strongly convex losses. Our result is minimax optimal with respect to both the time horizon $T$ and the dimension $d$.

Subjects: Machine Learning (cs.LG)

Cite as: arXiv:2603.25029 [cs.LG] (or arXiv:2603.25029v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.25029
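To make the setting concrete, here is a minimal sketch of the classical two-point gradient estimator combined with projected online gradient descent. It is not the algorithm analyzed in the paper, which the abstract does not spell out; it only illustrates how two function values per round can stand in for a gradient. All names (`two_point_gradient`, `run_bandit_ogd`, `smoothing`), the unit-ball domain, and the step size $1/(\mu t)$ are assumptions chosen for illustration.

```python
import math
import random


def random_unit_vector(d, rng):
    """Draw a direction uniformly from the unit sphere in R^d."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:
            return [v / norm for v in g]


def two_point_gradient(f, x, smoothing, rng):
    """Estimate the gradient of f at x from exactly two function values."""
    d = len(x)
    u = random_unit_vector(d, rng)
    f_plus = f([xi + smoothing * ui for xi, ui in zip(x, u)])
    f_minus = f([xi - smoothing * ui for xi, ui in zip(x, u)])
    scale = d * (f_plus - f_minus) / (2.0 * smoothing)
    return [scale * ui for ui in u]


def project_ball(x, radius):
    """Euclidean projection onto the centered ball of the given radius."""
    norm = math.sqrt(sum(xi * xi for xi in x))
    if norm <= radius:
        return list(x)
    return [xi * radius / norm for xi in x]


def run_bandit_ogd(losses, d, mu, radius=1.0, smoothing=1e-3, seed=0):
    """Projected OGD driven only by two-point feedback (illustrative sketch).

    Assumption: each loss is mu-strongly convex on the ball of `radius`.
    Iterates are kept in a slightly shrunk ball so both query points
    x +/- smoothing * u stay inside the domain.
    """
    rng = random.Random(seed)
    x = [0.0] * d
    played = []
    for t, f in enumerate(losses, start=1):
        played.append(list(x))
        g = two_point_gradient(f, x, smoothing, rng)
        eta = 1.0 / (mu * t)  # standard step size for strongly convex OGD
        x = project_ball(
            [xi - eta * gi for xi, gi in zip(x, g)], radius - smoothing
        )
    return played


# Usage: a fixed 1-strongly convex quadratic, repeated for 2000 rounds.
target = [0.3, -0.2]


def quadratic(x):
    return 0.5 * sum((xi - ci) ** 2 for xi, ci in zip(x, target))


played = run_bandit_ogd([quadratic] * 2000, d=2, mu=1.0)
print(played[-1])  # the last iterate should sit close to `target`
```

The estimator scales the finite difference along a random direction by the dimension $d$, which makes it an unbiased estimate of the gradient of a smoothed version of the loss. Its randomness is what makes high-probability (rather than in-expectation) regret analysis delicate, which is the difficulty the abstract points to.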

Originally published on March 27, 2026. Curated by AI News.
