[2602.15538] Functional Central Limit Theorem for Stochastic Gradient Descent
Summary
This paper proves a functional central limit theorem for the trajectory of the stochastic gradient descent (SGD) algorithm applied to convex functions, characterizing the long-term fluctuations of the trajectory around the minimizer via a diffusion limit.
Why It Matters
Understanding the asymptotic behavior of SGD is crucial for analyzing and tuning machine learning algorithms, especially in non-smooth settings. By describing the temporal structure of the fluctuations, this research gives a finer picture of the variability of SGD than limit theorems for a single iterate, which can inform its robust and efficient use in practice.
Key Takeaways
- The paper proves a functional central limit theorem for SGD trajectories.
- It characterizes long-term fluctuations around the minimizer of convex functions.
- The results apply to non-smooth optimization problems, such as robust location estimation.
- Unlike classical central limit theorems for the last iterate or for Polyak-Ruppert averages, the functional result captures the temporal structure of the fluctuations.
- Understanding these fluctuations can improve the performance of machine learning algorithms.
Statistics > Machine Learning
arXiv:2602.15538 (stat) [Submitted on 17 Feb 2026]
Title: Functional Central Limit Theorem for Stochastic Gradient Descent
Authors: Kessang Flamand, Victor-Emmanuel Brunel
Abstract: We study the asymptotic shape of the trajectory of the stochastic gradient descent algorithm applied to a convex objective function. Under mild regularity assumptions, we prove a functional central limit theorem for the properly rescaled trajectory. Our result characterizes the long-term fluctuations of the algorithm around the minimizer by providing a diffusion limit for the trajectory. In contrast with classical central limit theorems for the last iterate or Polyak-Ruppert averages, this functional result captures the temporal structure of the fluctuations and applies to non-smooth settings such as robust location estimation, including the geometric median.
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as: arXiv:2602.15538 [stat.ML] (arXiv:2602.15538v1 for this version)
DOI: https://doi.org/10.48550/arXiv.2602.15538 (arXiv-issued DOI via DataCite, pending registration)
Submission history: From: Victor-Emmanuel Brunel [v1] Tue, 17 Feb 2026 12:42:19 UTC (3,991 KB)
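The non-smooth example named in the abstract, geometric median estimation, can be sketched with plain SGD. This is a minimal illustration, not the paper's construction: the step-size schedule, starting point, and heavy-tailed test data below are all assumptions chosen for the demo, and only the subgradient update itself is standard.

```python
import numpy as np

def sgd_geometric_median(samples, x0, step=lambda n: (n + 1) ** -0.75):
    """SGD on the non-smooth objective f(x) = E||x - X|| (geometric median).

    A stochastic subgradient of ||x - X|| at x != X is (x - X) / ||x - X||,
    so each update moves a step of length step(n) toward the fresh sample.
    Returns the whole trajectory so its fluctuations can be inspected.
    NOTE: the step schedule n^{-0.75} is an assumption for this demo,
    not a recommendation from the paper.
    """
    x = np.asarray(x0, dtype=float).copy()
    traj = [x.copy()]
    for n, X in enumerate(samples):
        g = x - X
        norm = np.linalg.norm(g)
        if norm > 0.0:  # the subgradient can be taken as 0 at x == X
            x = x - step(n) * g / norm
        traj.append(x.copy())
    return np.array(traj)

rng = np.random.default_rng(0)
# Heavy-tailed, symmetric data in R^2: the geometric median is the origin,
# even though t(2) data has infinite variance.
data = rng.standard_t(df=2.0, size=(20000, 2))
traj = sgd_geometric_median(data, x0=[3.0, 3.0])
print(traj[-1])  # last iterate, near the origin
```

Plotting `traj` (or a rescaled version of it) shows the kind of fluctuating path around the minimizer whose limiting law the functional central limit theorem describes.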