[2603.02639] Convex and Non-convex Federated Learning with Stale Stochastic Gradients: Diminishing Step Size is All You Need

arXiv - Machine Learning

About this article

Mathematics > Optimization and Control
arXiv:2603.02639 (math) [Submitted on 3 Mar 2026]

Title: Convex and Non-convex Federated Learning with Stale Stochastic Gradients: Diminishing Step Size is All You Need
Authors: Xinran Zheng, Tara Javidi, Behrouz Touri

Abstract: We propose a general framework for distributed stochastic optimization under delayed gradient models. In this setting, $n$ local agents leverage their own data and computation to assist a central server in minimizing a global objective composed of the agents' local cost functions. Each agent is allowed to transmit stochastic (potentially biased and delayed) estimates of its local gradient. While prior work has advocated delay-adaptive step sizes for stochastic gradient descent (SGD) in the presence of delays, we demonstrate that a pre-chosen diminishing step size is sufficient and matches the performance of the adaptive scheme. Moreover, our analysis establishes that diminishing step sizes recover the optimal SGD rates for nonconvex and strongly convex objectives.

Subjects: Optimization and Control (math.OC); Machine Learning (cs.LG)
Cite as: arXiv:2603.02639 [math.OC] (or arXiv:2603.02639v1 [math.OC] for this version), https://doi.org/10.48550/arXiv.2603.02639
arXiv-issued DOI via DataCite (pending registration)
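The scheme the abstract describes is ordinary SGD driven by gradients evaluated at stale iterates, $x_{t+1} = x_t - \eta_t \, g(x_{t-\tau_t})$, where the step-size sequence $\eta_t$ is fixed before training and diminishes over time rather than being adapted to the observed delays. Below is a minimal, self-contained Python sketch of that idea on a toy strongly convex least-squares problem; the delay model, the $\eta_t = \eta_0/\sqrt{t+1}$ schedule, and all names (`stale_sgd`, `max_delay`, `eta0`) are illustrative assumptions for this note, not the paper's algorithm or code.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: f(x) = (1/n) * sum_i 0.5 * ||A_i x - b_i||^2, one block per agent.
n_agents, dim, rows = 4, 10, 20
A = [rng.standard_normal((rows, dim)) for _ in range(n_agents)]
b = [rng.standard_normal(rows) for _ in range(n_agents)]

def stochastic_grad(i, x):
    """Unbiased single-sample gradient of agent i's local cost at x."""
    j = rng.integers(rows)                    # sample one data row
    a = A[i][j]
    return rows * a * (a @ x - b[i][j])       # rescale to keep the estimate unbiased

def stale_sgd(T=5000, eta0=0.02, max_delay=10):
    x = np.zeros(dim)
    history = [x.copy()]                      # past iterates that agents may read
    for t in range(T):
        eta = eta0 / np.sqrt(t + 1)           # pre-chosen diminishing step size
        g = np.zeros(dim)
        for i in range(n_agents):
            # Each agent evaluates its gradient at a randomly delayed iterate.
            delay = rng.integers(min(max_delay, len(history)))
            g += stochastic_grad(i, history[-1 - delay])
        x = x - eta * g / n_agents
        history.append(x.copy())
    return x

x_hat = stale_sgd()
# Sanity check against the pooled least-squares solution.
x_star = np.linalg.lstsq(np.vstack(A), np.concatenate(b), rcond=None)[0]
print("distance to optimum:", np.linalg.norm(x_hat - x_star))
```

Because $\eta_t \to 0$, late updates shrink faster than staleness can amplify them, which is one informal way to see why a pre-chosen schedule can cope with delays without measuring them; for strongly convex objectives the classical choice would be an $O(1/t)$ schedule rather than $O(1/\sqrt{t})$.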

Originally published on March 04, 2026. Curated by AI News.


