[2505.11985] Variance-Optimal Arm Selection: Misallocation Minimization and Best Arm Identification

[2505.11985] Variance-Optimal Arm Selection: Misallocation Minimization and Best Arm Identification

arXiv - Machine Learning 4 min read Article

Summary

This paper presents novel algorithms for selecting the arm with the highest variance among independent arms, focusing on misallocation minimization and best arm identification.

Why It Matters

The study addresses critical challenges in multi-armed bandit problems, particularly in optimizing decision-making processes in uncertain environments. By introducing new algorithms, it enhances the efficiency of identifying optimal choices, which is vital in various applications like finance and machine learning.

Key Takeaways

  • Introduces UCB-VV and SHVV algorithms for variance-optimal arm selection.
  • Demonstrates that UCB-VV achieves order optimality in misallocation minimization.
  • SHVV shows superior performance in fixed-budget best arm identification.
  • Extends theoretical frameworks to sub-Gaussian distributions with new concentration inequalities.
  • Empirical results indicate UCB-VV outperforms traditional methods like epsilon-greedy.

Computer Science > Machine Learning arXiv:2505.11985 (cs) [Submitted on 17 May 2025 (v1), last revised 17 Feb 2026 (this version, v3)] Title:Variance-Optimal Arm Selection: Misallocation Minimization and Best Arm Identification Authors:Sabrina Khurshid, Gourab Ghatak, Mohammad Shahid Abdulla View a PDF of the paper titled Variance-Optimal Arm Selection: Misallocation Minimization and Best Arm Identification, by Sabrina Khurshid and 2 other authors View PDF HTML (experimental) Abstract:This paper focuses on selecting the arm with the highest variance from a set of $K$ independent arms. Specifically, we focus on two settings: (i) misallocation minimization setting, that penalizes the number of pulls of suboptimal arms in terms of variance, and (ii) fixed-budget best arm identification setting, that evaluates the ability of an algorithm to determine the arm with the highest variance after a fixed number of pulls. We develop a novel online algorithm called UCB-VV for the misallocation minimization (MM) and show that its upper bound on misallocation for bounded rewards evolves as $\mathcal{O}\left(\log{n}\right)$ where $n$ is the horizon. By deriving the lower bound on the misallocation, we show that UCB-VV is order optimal. For the fixed budget best arm identification (BAI) setting we propose the SHVV algorithm. We show that the upper bound of the error probability of SHVV evolves as $\exp\left(-\frac{n}{\log(K) H}\right)$, where $H$ represents the complexity of the problem, a...

Related Articles

UMKC Announces New Master of Science in Artificial Intelligence
Ai Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Sam Altman's Coworkers Say He Can Barely Code and Misunderstands Basic Machine Learning Concepts
Machine Learning

Sam Altman's Coworkers Say He Can Barely Code and Misunderstands Basic Machine Learning Concepts

AI News - General · 2 min ·
Interpretable machine learning model advances analysis of complex genetic traits
Machine Learning

Interpretable machine learning model advances analysis of complex genetic traits

AI News - General · 6 min ·
Why AI Is Training on Its Own Garbage (and How to Fix It)
Machine Learning

Why AI Is Training on Its Own Garbage (and How to Fix It)

AI News - General · 8 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime