[2603.02639] Convex and Non-convex Federated Learning with Stale Stochastic Gradients: Diminishing Step Size is All You Need

arXiv - Machine Learning

About this article

Mathematics > Optimization and Control
arXiv:2603.02639 (math) [Submitted on 3 Mar 2026]

Title: Convex and Non-convex Federated Learning with Stale Stochastic Gradients: Diminishing Step Size is All You Need
Authors: Xinran Zheng, Tara Javidi, Behrouz Touri

Abstract: We propose a general framework for distributed stochastic optimization under delayed gradient models. In this setting, $n$ local agents leverage their own data and computation to assist a central server in minimizing a global objective composed of the agents' local cost functions. Each agent is allowed to transmit stochastic (potentially biased and delayed) estimates of its local gradient. While prior work has advocated delay-adaptive step sizes for stochastic gradient descent (SGD) in the presence of delays, we demonstrate that a pre-chosen diminishing step size is sufficient and matches the performance of the adaptive scheme. Moreover, our analysis establishes that diminishing step sizes recover the optimal SGD rates for nonconvex and strongly convex objectives.

Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG)
Cite as: arXiv:2603.02639 [math.OC] (or arXiv:2603.02639v1 [math.OC] for this version), https://doi.org/10.48550/arXiv.2603.02639
arXiv-issued DOI via DataCite (pending registration)
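The scheme the abstract describes is ordinary SGD driven by gradients evaluated at stale iterates, $x_{t+1} = x_t - \eta_t \, g(x_{t-\tau_t})$, where the step-size sequence $\eta_t$ is fixed before training and diminishes over time rather than being adapted to the observed delays. Below is a minimal, self-contained Python sketch of that idea on a toy strongly convex least-squares problem; the delay model, the $\eta_t = \eta_0/\sqrt{t+1}$ schedule, and all names (`stale_sgd`, `max_delay`, `eta0`) are illustrative assumptions for this note, not the paper's algorithm or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: f(x) = (1/n) * sum_i 0.5 * ||A_i x - b_i||^2, one block per agent.
n_agents, dim, rows = 4, 10, 20
A = [rng.standard_normal((rows, dim)) for _ in range(n_agents)]
b = [rng.standard_normal(rows) for _ in range(n_agents)]

def stochastic_grad(i, x):
    """Unbiased single-sample gradient of agent i's local cost at x."""
    j = rng.integers(rows)                    # sample one data row
    a = A[i][j]
    return rows * a * (a @ x - b[i][j])       # rescale to keep the estimate unbiased

def stale_sgd(T=5000, eta0=0.02, max_delay=10):
    x = np.zeros(dim)
    history = [x.copy()]                      # past iterates that agents may read
    for t in range(T):
        eta = eta0 / np.sqrt(t + 1)           # pre-chosen diminishing step size
        g = np.zeros(dim)
        for i in range(n_agents):
            # Each agent evaluates its gradient at a randomly delayed iterate.
            delay = rng.integers(min(max_delay, len(history)))
            g += stochastic_grad(i, history[-1 - delay])
        x = x - eta * g / n_agents
        history.append(x.copy())
    return x

x_hat = stale_sgd()
# Sanity check against the pooled least-squares solution.
x_star = np.linalg.lstsq(np.vstack(A), np.concatenate(b), rcond=None)[0]
print("distance to optimum:", np.linalg.norm(x_hat - x_star))
```

Because $\eta_t \to 0$, late updates shrink faster than staleness can amplify them, which is one informal way to see why a pre-chosen schedule can cope with delays without measuring them; for strongly convex objectives the classical choice would be an $O(1/t)$ schedule rather than $O(1/\sqrt{t})$.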

Originally published on March 04, 2026. Curated by AI News.


