[2602.14789] On the Stability of Nonlinear Dynamics in GD and SGD: Beyond Quadratic Potentials

arXiv - Machine Learning

Summary

This paper explores the stability of nonlinear dynamics in gradient descent (GD) and stochastic gradient descent (SGD), revealing that linear analysis may misrepresent stability and convergence behaviors in optimization algorithms.

Why It Matters

Understanding the stability of optimization algorithms like GD and SGD is crucial for improving machine learning models. This research challenges traditional linear analysis, suggesting that nonlinear dynamics play a significant role in convergence, which could lead to better training strategies in complex models.

Key Takeaways

  • Stable solutions in GD correspond to flat minima, which are desirable for optimization.
  • Linearized dynamics may not accurately capture the stability of nonlinear behaviors.
  • In SGD, nonlinear dynamics can diverge in expectation even if only a single batch is unstable, so a single batch can dictate overall stability.
  • A new criterion for stable oscillations in GD is established, relying on high-order derivatives.
  • If all batches in SGD are linearly stable, the overall nonlinear dynamics remain stable in expectation.
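The oscillation phenomenon from the takeaways can be seen in a one-dimensional toy example (an illustrative sketch, not the paper's multivariate criterion): for gradient descent on f(x) = log(cosh(x)), the minimum at 0 has curvature f''(0) = 1, so the linearized iteration is unstable for any step size η > 2. Yet because the gradient tanh(x) saturates away from the origin, the nonlinear iterates settle into a stable period-2 oscillation instead of diverging.

```python
import math

def gd_logcosh(eta, x0, steps):
    """Gradient descent on f(x) = log(cosh(x)), whose gradient is tanh(x)."""
    x = x0
    traj = [x]
    for _ in range(steps):
        x = x - eta * math.tanh(x)
        traj.append(x)
    return traj

eta = 2.5  # linearized multiplier |1 - eta * f''(0)| = 1.5 > 1: linearly unstable
traj = gd_logcosh(eta, x0=0.2, steps=500)

# The iterates stay bounded and converge to a period-2 cycle +-x*,
# where x* solves tanh(x*) = (2/eta) * x* (here x* is roughly 0.888).
a, b = traj[-2], traj[-1]
print(a, b)  # alternating signs, equal magnitude
```

The 2-cycle is attracting because its multiplier, the product of the map's derivatives 1 - η(1 - tanh²(x)) at the two cycle points, is well below 1 in magnitude, even though the fixed point at 0 is repelling. This matches the paper's observation that linear analysis alone can mislabel such dynamics as unstable.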

Computer Science > Machine Learning · arXiv:2602.14789 (cs)
[Submitted on 16 Feb 2026]

Title: On the Stability of Nonlinear Dynamics in GD and SGD: Beyond Quadratic Potentials
Authors: Rotem Mulayoff, Sebastian U. Stich

Abstract: The dynamical stability of the iterates during training plays a key role in determining the minima obtained by optimization algorithms. For example, stable solutions of gradient descent (GD) correspond to flat minima, which have been associated with favorable features. While prior work often relies on linearization to determine stability, it remains unclear whether linearized dynamics faithfully capture the full nonlinear behavior. Recent work has shown that GD may stably oscillate near a linearly unstable minimum and still converge once the step size decays, indicating that linear analysis can be misleading. In this work, we explicitly study the effect of nonlinear terms. Specifically, we derive an exact criterion for stable oscillations of GD near minima in the multivariate setting. Our condition depends on high-order derivatives, generalizing existing results. Extending the analysis to stochastic gradient descent (SGD), we show that nonlinear dynamics can diverge in expectation even if a single batch is unstable. This implies that stability can be dictated by a single batch that oscillate...
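The interplay between per-batch and full-batch stability in the abstract can be illustrated with a hypothetical two-batch quadratic toy (a much simpler setting than the paper's nonlinear analysis; the curvatures a1, a2 and step size below are made-up numbers): with batch losses f_i(x) = a_i x²/2 each sampled with probability 1/2, the second moment satisfies E[x_{k+1}²] = ρ · E[x_k²] with ρ = ((1 - η a1)² + (1 - η a2)²)/2, so one sufficiently unstable batch forces divergence in expectation even when full-batch GD on the average loss is stable.

```python
# Two-batch quadratic toy (illustrative assumption, not the paper's setting):
# batch losses f_i(x) = a_i * x^2 / 2, each chosen with probability 1/2.
a1, a2 = 1.0, 3.0
eta = 0.9

# Full-batch GD on f = (f1 + f2)/2 sees curvature (a1 + a2)/2 = 2:
avg_mult = abs(1 - eta * (a1 + a2) / 2)  # 0.8 < 1 -> full-batch GD is stable

# Per-step growth factor of the second moment E[x_k^2] under SGD:
rho = 0.5 * ((1 - eta * a1) ** 2 + (1 - eta * a2) ** 2)  # 1.45 > 1

print(avg_mult, rho)  # stable full-batch multiplier, yet E[x_k^2] diverges
```

Here batch 2 alone is unstable (|1 - η a2| = 1.7 > 1) and drives ρ above 1. In the quadratic case this is a standard second-moment computation; the paper's contribution is showing how such single-batch effects carry over to the nonlinear dynamics.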
