[2603.03226] Adaptive Methods Are Preferable in High Privacy Settings: An SDE Perspective
Computer Science > Machine Learning
arXiv:2603.03226 (cs)
[Submitted on 3 Mar 2026]
Title: Adaptive Methods Are Preferable in High Privacy Settings: An SDE Perspective
Authors: Enea Monzio Compagnoni, Alessandro Stanghellini, Rustem Islamov, Aurelien Lucchi, Anastasiia Koloskova
Abstract: Differential Privacy (DP) is becoming central to large-scale training as privacy regulations tighten. We revisit how DP noise interacts with adaptivity in optimization through the lens of stochastic differential equations, providing the first SDE-based analysis of private optimizers. Focusing on DP-SGD and DP-SignSGD under per-example clipping, we show a sharp contrast under fixed hyperparameters: DP-SGD converges at a Privacy-Utility Trade-Off of $\mathcal{O}(1/\varepsilon^2)$ with speed independent of $\varepsilon$, while DP-SignSGD converges at a speed linear in $\varepsilon$ with an $\mathcal{O}(1/\varepsilon)$ trade-off, dominating in high-privacy or large batch noise regimes. By contrast, under optimal learning rates, both methods achieve comparable theoretical asymptotic performance; however, the optimal learning rate of DP-SGD scales linearly with $\varepsilon$, while that of DP-SignSGD is essentially $\varepsilon$-independent. This makes adaptive methods far more practical, as their hyperparameters tra...
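As a rough illustration of the two update rules named in the abstract, the sketch below shows one step of DP-SGD and one step of DP-SignSGD with per-example clipping, using the common textbook formulation (clip each per-example gradient, sum, add Gaussian noise, average). It is not taken from the paper, whose exact formulation may differ. The function names, the `noise_multiplier` parameter, and the toy data are assumptions made for illustration, and the calibration of the noise to a target $\varepsilon$ is deliberately left out.

```python
import numpy as np

def clip_per_example(grads, clip_norm):
    """Rescale each row (one per-example gradient) to L2 norm <= clip_norm."""
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    return grads * scale

def private_mean_gradient(grads, clip_norm, noise_multiplier, rng):
    """Clip per example, sum, add Gaussian noise, then average over the batch.

    Assumption: the noise standard deviation is noise_multiplier * clip_norm;
    how noise_multiplier maps to a privacy budget is not modeled here.
    """
    batch_size = grads.shape[0]
    clipped_sum = clip_per_example(grads, clip_norm).sum(axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=clipped_sum.shape)
    return (clipped_sum + noise) / batch_size

def dp_sgd_step(params, grads, lr, clip_norm, noise_multiplier, rng):
    """DP-SGD: move along the noisy clipped mean gradient."""
    g = private_mean_gradient(grads, clip_norm, noise_multiplier, rng)
    return params - lr * g

def dp_signsgd_step(params, grads, lr, clip_norm, noise_multiplier, rng):
    """DP-SignSGD: keep only the sign of each noisy coordinate."""
    g = private_mean_gradient(grads, clip_norm, noise_multiplier, rng)
    return params - lr * np.sign(g)

# Toy usage with random stand-in gradients: 8 examples, 3 parameters.
rng = np.random.default_rng(0)
grads = rng.normal(size=(8, 3)) * 5.0
params = np.zeros(3)
sgd_params = dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0,
                         noise_multiplier=1.0, rng=rng)
sign_params = dp_signsgd_step(params, grads, lr=0.1, clip_norm=1.0,
                              noise_multiplier=1.0, rng=rng)
```

In this sketch the DP-SignSGD step moves every coordinate by exactly `lr` regardless of the noise magnitude, whereas the DP-SGD step size grows with the injected noise. That is one informal way to read the abstract's claim that the two methods respond differently to $\varepsilon$.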