[2602.21191] Statistical Query Lower Bounds for Smoothed Agnostic Learning
Summary
This paper proves a Statistical Query (SQ) lower bound for smoothed agnostic learning, pinning down the complexity of agnostically learning halfspaces when the inputs are subject to slight Gaussian perturbations, a significant theoretical result in machine learning.
Why It Matters
Understanding the complexity of smoothed agnostic learning is crucial for designing efficient machine learning algorithms: the smoothed model relaxes worst-case agnostic learning by allowing small Gaussian perturbations of the inputs. By giving formal evidence that the best known algorithm is nearly optimal, this work tells researchers where further algorithmic improvements are and are not possible, informing future work in learning theory and algorithm design.
Key Takeaways
- Introduces a Statistical Query lower bound for smoothed agnostic learning.
- Establishes the complexity of agnostically learning halfspaces under Gaussian perturbations of the inputs.
- Shows that the best known upper bound for this task is close to optimal.
- Utilizes linear programming duality to find hard distributions.
- Contributes to the theoretical understanding of learning algorithms.
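The smoothed model behind these results can be made concrete with a small sketch. The code below (illustrative only; all names and parameters are our own, not from the paper) estimates the "smoothed" 0-1 error of a classifier, i.e. its average error when each input is perturbed by Gaussian noise of scale sigma before classification:

```python
import numpy as np

rng = np.random.default_rng(0)

def smoothed_error(h, X, y, sigma, n_draws=100):
    """Estimate the smoothed 0-1 error of classifier h: the average error of
    h(x + sigma * z) over Gaussian perturbations z ~ N(0, I)."""
    errs = []
    for _ in range(n_draws):
        Z = rng.standard_normal(X.shape)
        preds = h(X + sigma * Z)
        errs.append(np.mean(preds != y))
    return float(np.mean(errs))

# Toy data: labels given by a halfspace sign(<w, x>) in d dimensions.
d, n, sigma = 5, 1000, 0.3
w = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = np.sign(X @ w)

halfspace = lambda X: np.sign(X @ w)
err = smoothed_error(halfspace, X, y, sigma)
```

Even the true halfspace incurs nonzero smoothed error, since points near the decision boundary can flip under perturbation; the learner's goal in the smoothed agnostic model is to compete with the best classifier under exactly this kind of perturbed evaluation.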
Computer Science > Machine Learning
arXiv:2602.21191 (cs) [Submitted on 24 Feb 2026]
Title: Statistical Query Lower Bounds for Smoothed Agnostic Learning
Authors: Ilias Diakonikolas, Daniel M. Kane
Abstract: We study the complexity of smoothed agnostic learning, recently introduced by [CKKMS24], in which the learner competes with the best classifier in a target class under slight Gaussian perturbations of the inputs. Specifically, we focus on the prototypical task of agnostically learning halfspaces under subgaussian distributions in the smoothed model. The best known upper bound for this problem relies on $L_1$-polynomial regression and has complexity $d^{\tilde{O}(1/\sigma^2) \log(1/\epsilon)}$, where $\sigma$ is the smoothing parameter and $\epsilon$ is the excess error. Our main result is a Statistical Query (SQ) lower bound providing formal evidence that this upper bound is close to best possible. In more detail, we show that (even for Gaussian marginals) any SQ algorithm for smoothed agnostic learning of halfspaces requires complexity $d^{\Omega(1/\sigma^{2}+\log(1/\epsilon))}$. This is the first non-trivial lower bound on the complexity of this task and nearly matches the known upper bound. Roughly speaking, we show that applying $L_1$-polynomial regression to a smoothed version of the function is essen...
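The upper bound discussed in the abstract is based on $L_1$-polynomial regression, which can itself be cast as a linear program, the same tool the lower-bound argument uses via duality. As a toy illustration (our own sketch, not the paper's algorithm: in the actual upper bound the polynomial degree scales roughly as $1/\sigma^2$, while here we just fit degree 2 on synthetic 2-dimensional data), one can fit a low-degree polynomial to the labels minimizing total absolute error and classify with its sign:

```python
import numpy as np
from itertools import combinations_with_replacement
from scipy.optimize import linprog

rng = np.random.default_rng(1)

def poly_features(X, degree):
    """Monomial features up to the given total degree (small d only)."""
    n, d = X.shape
    cols = [np.ones(n)]
    for k in range(1, degree + 1):
        for idx in combinations_with_replacement(range(d), k):
            col = np.ones(n)
            for i in idx:
                col = col * X[:, i]
            cols.append(col)
    return np.column_stack(cols)

def l1_poly_regression(X, y, degree):
    """Fit coefficients c minimizing sum_i |phi(x_i)^T c - y_i| as an LP."""
    Phi = poly_features(X, degree)
    n, p = Phi.shape
    # Variables: [c (p entries), t (n entries)]; minimize sum(t)
    # subject to |Phi c - y| <= t, encoded as two one-sided constraints.
    cost = np.concatenate([np.zeros(p), np.ones(n)])
    A_ub = np.block([[Phi, -np.eye(n)], [-Phi, -np.eye(n)]])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * p + [(0, None)] * n
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return Phi, res.x[:p]

# Labels from a halfspace; the sign of the L1-optimal low-degree fit
# classifies the training points well.
d, n = 2, 200
w = np.array([1.0, -0.5])
X = rng.standard_normal((n, d))
y = np.sign(X @ w)
Phi, c = l1_poly_regression(X, y, degree=2)
train_err = np.mean(np.sign(Phi @ c) != y)
```

The LP formulation is also why linear programming duality is a natural tool for lower bounds: the dual of this program certifies how poorly any low-degree polynomial can fit a given distribution, which is exactly the kind of "hard distribution" the paper constructs.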