[2602.14474] One Good Source is All You Need: Near-Optimal Regret for Bandits under Heterogeneous Noise
Summary
The paper presents SOAR, a novel algorithm for the $K$-armed Multiarmed Bandit (MAB) problem with multiple heterogeneous data sources, which achieves near-optimal regret despite the sources' unknown and distinct noise variances.
Why It Matters
This research addresses a significant challenge in machine learning: selecting among data sources with varying noise levels. By introducing SOAR, the authors provide a method that not only minimizes regret but also improves the efficiency of decision-making in uncertain environments, which is crucial for applications in AI and data science.
Key Takeaways
- SOAR algorithm effectively minimizes regret in K-armed bandit problems with heterogeneous noise.
- The approach integrates variance-concentration bounds for efficient source selection.
- SOAR achieves near-optimal instance-dependent regret, matching the rate of single-source MAB with the best (minimum-variance) source, despite the variances being unknown.
- Experimental results demonstrate SOAR's superiority over traditional baselines like Uniform UCB.
- The findings have implications for real-world applications where data source variability is common.
Abstract
Computer Science > Machine Learning, arXiv:2602.14474 (cs). Submitted on 16 Feb 2026.
Title: One Good Source is All You Need: Near-Optimal Regret for Bandits under Heterogeneous Noise
Authors: Aadirupa Saha, Amith Bhat, Haipeng Luo
We study the $K$-armed Multiarmed Bandit (MAB) problem with $M$ heterogeneous data sources, each exhibiting unknown and distinct noise variances $\{\sigma_j^2\}_{j=1}^M$. The learner's objective is standard MAB regret minimization, with the additional complexity of adaptively selecting which data source to query at each round. We propose Source-Optimistic Adaptive Regret minimization (SOAR), a novel algorithm that quickly prunes high-variance sources using sharp variance-concentration bounds, followed by a `balanced min-max LCB-UCB approach' that seamlessly integrates the parallel tasks of identifying the best arm and the optimal (minimum-variance) data source. Our analysis shows SOAR achieves an instance-dependent regret bound of $\tilde{O}\left({\sigma^*}^2\sum_{i=2}^K \frac{\log T}{\Delta_i} + \sqrt{K \sum_{j=1}^M \sigma_j^2}\right)$, up to preprocessing costs depending only on problem parameters, where ${\sigma^*}^2 := \min_j \sigma_j^2$ is the minimum source variance and $\Delta_i$ denotes the suboptimality gap of the $i$-th arm. This result is bot...
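The pruning phase described in the abstract, eliminating high-variance sources via variance-concentration bounds, can be sketched as follows. The confidence radius below is a generic placeholder of empirical-Bernstein flavor, not the sharp bounds proven in the paper, and the variances and sample count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

M, n = 4, 500
true_vars = np.array([0.25, 1.0, 4.0, 9.0])   # hypothetical source variances

# Draw n zero-mean samples per source and form empirical variances.
samples = rng.standard_normal((M, n)) * np.sqrt(true_vars)[:, None]
emp_var = samples.var(axis=1, ddof=1)

# Generic confidence radius on each empirical variance (placeholder constants,
# not the variance-concentration bounds from the paper).
delta = 0.05
rad = emp_var * np.sqrt(np.log(2 * M / delta) / n) + np.log(2 * M / delta) / n

lcb = emp_var - rad    # lower confidence bound on each source variance
ucb = emp_var + rad    # upper confidence bound on each source variance

# Prune any source whose variance LCB exceeds the smallest variance UCB:
# with high probability such a source cannot be the minimum-variance one.
keep = lcb <= ucb.min()
print("kept sources:", np.flatnonzero(keep))
```

After this elimination step, only sources statistically indistinguishable from the minimum-variance source remain, which is what lets the subsequent arm-identification phase pay (roughly) only the best source's variance ${\sigma^*}^2$ in its regret.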