[2512.13123] Stopping Rules for Stochastic Gradient Descent via Anytime-Valid Confidence Sequences
Summary
This paper presents a novel framework for determining stopping rules in Stochastic Gradient Descent (SGD) using anytime-valid confidence sequences, addressing a critical gap in existing optimization methods.
Why It Matters
The ability to stop SGD based on real-time performance metrics can significantly enhance computational efficiency and resource management in machine learning applications. This research provides a statistically valid way to decide, from the observed trajectory alone, when to stop, and it applies to both convex and nonconvex optimization problems.
Key Takeaways
- Introduces anytime-valid confidence sequences for SGD stopping rules.
- Induces statistically valid stopping criteria from the observed trajectory, without prior knowledge of the optimization horizon.
- Applies to both convex and nonconvex optimization scenarios.
- Enhances computational efficiency by allowing timely termination of SGD.
- Characterizes stopping-time complexity under standard stepsize schedules.
Mathematics > Optimization and Control
arXiv:2512.13123 (math)
[Submitted on 15 Dec 2025 (v1), last revised 20 Feb 2026 (this version, v5)]
Title: Stopping Rules for Stochastic Gradient Descent via Anytime-Valid Confidence Sequences
Authors: Liviu Aolaritei, Michael I. Jordan
Abstract: The problem of stopping stochastic gradient descent (SGD) in an online manner, based solely on the observed trajectory, is a challenging theoretical problem with significant consequences for applications. While SGD is routinely monitored as it runs, the classical theory of SGD provides guarantees only at pre-specified iteration horizons and offers no valid way to decide, based on the observed trajectory, when further computation is justified. We address this longstanding gap by developing anytime-valid confidence sequences for stochastic gradient methods, which remain valid under continuous monitoring and directly induce statistically valid, trajectory-dependent stopping rules: stop as soon as the current upper confidence bound on an appropriate performance measure falls below a user-specified tolerance. The confidence sequences are constructed using nonnegative supermartingales, are time-uniform, and depend only on observable quantities along the SGD trajectory, without requiring prior knowledge of the optimiza...
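The stopping rule described in the abstract — stop as soon as an anytime-valid upper confidence bound on a performance measure falls below a tolerance — can be sketched in code. The following is an illustrative stand-in, not the paper's construction: instead of the supermartingale-based sequences, it uses a simple Hoeffding-style time-uniform width around the running mean of squared gradient norms, and the objective, noise model, tolerance, and constants are all hypothetical.

```python
import numpy as np

# Illustrative sketch only: the paper builds confidence sequences from
# nonnegative supermartingales; here a crude Hoeffding-style time-uniform
# bound serves as a stand-in. All names and constants are hypothetical.

rng = np.random.default_rng(0)

def noisy_grad(x):
    """Stochastic gradient of f(x) = 0.5 * ||x||^2 with bounded noise."""
    return x + rng.uniform(-0.5, 0.5, size=x.shape)

def anytime_upper_bound(mean, t, alpha=0.05, b=4.0):
    # Width shrinks roughly like sqrt(log(t)/t), so the bound is intended
    # to hold at every t simultaneously, not just at a fixed horizon.
    width = b * np.sqrt(2.0 * np.log(np.log(max(t, 2)) / alpha + 1.0) / t)
    return mean + width

x = np.array([2.0, -1.5])
eta, tol = 0.1, 0.5          # stepsize and user-specified tolerance
running_sum, stop_t = 0.0, None

for t in range(1, 10_001):
    g = noisy_grad(x)
    x -= eta * g                      # plain SGD step
    running_sum += float(g @ g)       # observed performance proxy
    ucb = anytime_upper_bound(running_sum / t, t)
    if ucb < tol:                     # trajectory-dependent stopping rule
        stop_t = t
        break

print("stopped at iteration:", stop_t)
```

Because the bound is valid under continuous monitoring, checking it at every iteration does not invalidate the guarantee — which is exactly what a fixed-horizon confidence interval cannot offer.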