[2505.22475] Non-Asymptotic Analysis of (Sticky) Track-and-Stop
Summary
This paper presents a non-asymptotic analysis of the Sticky Track-and-Stop algorithm, extending its guarantees beyond asymptotic optimality in pure exploration problems.
Why It Matters
Understanding non-asymptotic guarantees for the Sticky Track-and-Stop algorithm is crucial for improving decision-making in environments with multiple correct answers. This research fills a significant gap in the literature, enhancing algorithmic performance in real-world applications where uncertainty is prevalent.
Key Takeaways
- The Sticky Track-and-Stop algorithm addresses multi-answer environments in pure exploration problems.
- Non-asymptotic guarantees for both Track-and-Stop algorithms are established, filling a critical research gap.
- The findings can improve algorithmic efficiency in various applications, including best-arm identification.
Computer Science > Machine Learning arXiv:2505.22475 (cs) [Submitted on 28 May 2025 (v1), last revised 18 Feb 2026 (this version, v2)] Title:Non-Asymptotic Analysis of (Sticky) Track-and-Stop Authors:Riccardo Poiani, Martino Bernasconi, Andrea Celli View a PDF of the paper titled Non-Asymptotic Analysis of (Sticky) Track-and-Stop, by Riccardo Poiani and 2 other authors View PDF HTML (experimental) Abstract:In pure exploration problems, a statistician sequentially collects information to answer a question about some stochastic and unknown environment. The probability of returning a wrong answer should not exceed a maximum risk parameter $\delta$ and good algorithms make as few queries to the environment as possible. The Track-and-Stop algorithm is a pioneering method to solve these problems. Specifically, it is well-known that it enjoys asymptotic optimality sample complexity guarantees for $\delta\to 0$ whenever the map from the environment to its correct answers is single-valued (e.g., best-arm identification with a unique optimal arm). The Sticky Track-and-Stop algorithm extends these results to settings where, for each environment, there might exist multiple correct answers (e.g., $\epsilon$-optimal arm identification). Although both methods are optimal in the asymptotic regime, their non-asymptotic guarantees remain unknown. In this work, we fill this gap and provide non-asymptotic guarantees for both algorithms. Subjects: Machine Learning (cs.LG) Cite as: arXiv:2505.2...