[2506.23836] Proving the Limited Scalability of Centralized Distributed Optimization via a New Lower Bound Construction
Mathematics > Optimization and Control

arXiv:2506.23836 (math)

[Submitted on 30 Jun 2025 (v1), last revised 28 Mar 2026 (this version, v2)]

Title: Proving the Limited Scalability of Centralized Distributed Optimization via a New Lower Bound Construction

Authors: Alexander Tyurin

Abstract: We consider centralized distributed optimization in the classical federated learning setup, where $n$ workers jointly find an $\varepsilon$-stationary point of an $L$-smooth, $d$-dimensional nonconvex function $f$, having access only to unbiased stochastic gradients with variance $\sigma^2$. Each worker requires at most $h$ seconds to compute a stochastic gradient, and the communication times from the server to the workers and from the workers to the server are $\tau_{s}$ and $\tau_{w}$ seconds per coordinate, respectively. One of the main motivations for distributed optimization is to achieve scalability with respect to $n$. For instance, it is well known that the distributed version of SGD has a variance-dependent runtime term $\frac{h \sigma^2 L \Delta}{n \varepsilon^2}$, which improves with the number of workers $n$, where $\Delta = f(x^0) - f^*$ and $x^0 \in \mathbb{R}^d$ is the starting point. Similarly, using unbiased sparsification compressors, it is possible to reduce both the variance-dependent runtime te…
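
As a quick worked illustration of the linear speedup in $n$ that the abstract describes, one can plug concrete values into the variance-dependent term; the parameter values below are illustrative assumptions only and do not come from the paper:

$$
% illustrative values, not from the paper: h = \sigma^2 = L = \Delta = 1,\ \varepsilon = 0.1
\frac{h \sigma^2 L \Delta}{n \varepsilon^2}
= \frac{1 \cdot 1 \cdot 1 \cdot 1}{n \cdot (0.1)^2}
= \frac{100}{n}\ \text{seconds}.
$$

Under these assumed values, a single worker spends $100$ seconds on this term, while $n = 10$ workers spend only $10$ seconds: the variance-dependent part of the runtime shrinks proportionally to $1/n$, which is exactly the scalability motivation the paper's lower bound then shows to be limited once communication costs are accounted for.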