[2503.19874] Extensions of the regret-minimization algorithm for optimal design
Summary
This article presents extensions to the regret-minimization algorithm aimed at optimal design for multiclass classification, proposing a new sample selection objective and demonstrating improved performance on benchmark datasets.
Why It Matters
The advancements in regret-minimization algorithms are crucial for enhancing the efficiency of sample selection in machine learning, particularly in multiclass classification tasks. This research contributes to the ongoing efforts to optimize experimental design, which is vital for improving model accuracy and reducing labeling costs.
Key Takeaways
- Introduces a new regularization scheme within the regret-minimization framework.
- Proposes a sample selection objective with a provable sample complexity bound.
- Demonstrates consistent performance improvements over state-of-the-art methods on MNIST, CIFAR-10, and a 50-class subset of ImageNet.
- Extends the algorithm's applicability to ridge regression experimental design.
- Provides a theoretical foundation for achieving $(1+\epsilon)$-approximate solutions.
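To make the optimal-design setting concrete, here is a minimal sketch of a classic greedy D-optimal selection heuristic: repeatedly pick the unlabeled point that most increases the log-determinant of the (regularized) information matrix. This is a generic baseline for illustration only, not the paper's regret-minimization algorithm; the function name and the `reg` parameter are our own choices.

```python
import numpy as np

def greedy_d_optimal(X, k, reg=1e-3):
    """Greedily select k rows of X to maximize log det(A), where
    A = sum of x x^T over selected points + reg * I.
    Illustrative heuristic, not the paper's method."""
    n, d = X.shape
    A = reg * np.eye(d)  # ridge-style regularization keeps A invertible
    selected = []
    remaining = set(range(n))
    for _ in range(k):
        Ainv = np.linalg.inv(A)
        # matrix-determinant lemma: log det(A + x x^T) - log det(A)
        #   = log(1 + x^T A^{-1} x), so maximize the quadratic form
        best = max(remaining, key=lambda i: X[i] @ Ainv @ X[i])
        A += np.outer(X[best], X[best])
        selected.append(best)
        remaining.remove(best)
    return selected

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
idx = greedy_d_optimal(X, 10)
```

The `reg * I` term plays the same role as the ridge penalty in the regression setting the paper extends to: it keeps the information matrix well-conditioned before enough samples have been selected.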
Paper Details
Computer Science > Machine Learning (cs.LG) — arXiv:2503.19874
Submitted on 25 Mar 2025 (v1), last revised 25 Feb 2026 (this version, v2)
Title: Extensions of the regret-minimization algorithm for optimal design
Authors: Youguang Chen, George Biros
Abstract: We consider the problem of selecting a subset of points from a dataset of $n$ unlabeled examples for labeling, with the goal of training a multiclass classifier. To address this, we build upon the regret minimization framework introduced by Allen-Zhu et al. in "Near-optimal design of experiments via regret minimization" (ICML, 2017). We propose an alternative regularization scheme within this framework, which leads to a new sample selection objective along with a provable sample complexity bound that guarantees a $(1+\epsilon)$-approximate solution. Additionally, we extend the regret minimization approach to handle experimental design in the ridge regression setting. We evaluate the selected samples using logistic regression and compare performance against several state-of-the-art methods. Our empirical results on MNIST, CIFAR-10, and a 50-class subset of ImageNet demonstrate that our method consistently outperforms competing approaches across most scenarios.
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
MSC classes: 62J12, 62L05, 68W27, 68W40, 68T05
Cite as: arXiv:2503.19874