[2502.03652] Improving the Convergence of Private Shuffled Gradient Methods with Public Data
Summary
This article summarizes a paper that improves the convergence of private shuffled gradient methods in machine learning by integrating public data into the optimization loop, addressing a limitation the authors identify in purely private shuffled methods.
Why It Matters
The research highlights the gap between the theory and practice of differentially private algorithms in machine learning. By proposing a method that combines private and public data, it offers a way to improve accuracy while maintaining privacy, which is crucial for applications involving sensitive data.
Key Takeaways
- Introduces Interleaved-ShuffleG, a hybrid method that interleaves optimization steps on public data into private shuffled-gradient optimization.
- Shows that data shuffling leads to worse empirical excess risk for DP-ShuffleG than DP-SGD achieves with sampling with replacement.
- Provides a new framework for analyzing privacy-accuracy trade-offs in machine learning.
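To make the hybrid idea concrete, here is a minimal sketch of an interleaved private/public update loop: each private example contributes a clipped, noised gradient step, and a noiseless step on a public example is taken in between. All function names, the loss, and the exact interleaving schedule are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def grad(w, example):
    # Gradient of the squared loss 0.5 * (w @ x - y)**2 for one (x, y) pair.
    x, y = example
    return (w @ x - y) * x

def interleaved_shuffle_g(w, private_data, public_data, lr=0.1, clip=1.0,
                          noise_std=0.5, epochs=1, rng=None):
    """Illustrative hybrid loop: shuffle the private data once per epoch,
    take a clipped + noised step per private example, and interleave a
    clean (noise-free) step on a public example after each private step."""
    rng = rng or np.random.default_rng(0)
    n_pub = len(public_data)
    for _ in range(epochs):
        order = rng.permutation(len(private_data))  # one shuffle per epoch
        for t, i in enumerate(order):
            g = grad(w, private_data[i])
            g = g / max(1.0, np.linalg.norm(g) / clip)        # clip sensitivity
            g = g + noise_std * rng.standard_normal(g.shape)  # privatize
            w = w - lr * g
            # Public data needs no noise, so its step is taken as-is.
            w = w - lr * grad(w, public_data[t % n_pub])
    return w
```

The public steps cost no privacy budget, which is the intuition behind using them to offset the accuracy lost to noise on the private steps.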
Computer Science > Machine Learning
arXiv:2502.03652 (cs)
[Submitted on 5 Feb 2025 (v1), last revised 24 Feb 2026 (this version, v2)]
Title: Improving the Convergence of Private Shuffled Gradient Methods with Public Data
Authors: Shuli Jiang, Pranay Sharma, Zhiwei Steven Wu, Gauri Joshi
Abstract: We consider the problem of differentially private (DP) convex empirical risk minimization (ERM). While the standard DP-SGD algorithm is theoretically well-established, practical implementations often rely on shuffled gradient methods that traverse the training data sequentially rather than sampling with replacement in each iteration. Despite their widespread use, the theoretical privacy-accuracy trade-offs of private shuffled gradient methods (DP-ShuffleG) remain poorly understood, leading to a gap between theory and practice. In this work, we leverage privacy amplification by iteration (PABI) and a novel application of Stein's lemma to provide the first empirical excess risk bound of DP-ShuffleG. Our result shows that data shuffling results in worse empirical excess risk for DP-ShuffleG compared to DP-SGD. To address this limitation, we propose Interleaved-ShuffleG, a hybrid approach that integrates public data samples in private optimization. By alternating optimization steps that use private ...
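The distinction the abstract draws between the two access patterns can be sketched directly: DP-SGD draws an index with replacement at every iteration, while a shuffled gradient method permutes the dataset once per epoch and then walks through it sequentially. This is a generic illustration of the sampling schemes, not code from the paper.

```python
import numpy as np

def dp_sgd_order(n, steps, rng):
    # DP-SGD: each iteration samples an index with replacement,
    # so an example may be visited several times or not at all.
    return [int(rng.integers(n)) for _ in range(steps)]

def shuffle_g_order(n, epochs, rng):
    # DP-ShuffleG: shuffle once per epoch, then traverse sequentially,
    # so every example is visited exactly once per epoch.
    order = []
    for _ in range(epochs):
        order.extend(int(i) for i in rng.permutation(n))
    return order
```

Because a shuffled pass is not independent sampling, privacy-amplification-by-sampling arguments do not apply to it, which is why the paper turns to privacy amplification by iteration (PABI) instead.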