[2604.00260] Learning to Shuffle: Block Reshuffling and Reversal Schemes for Stochastic Optimization
Computer Science > Machine Learning
arXiv:2604.00260 (cs)
[Submitted on 31 Mar 2026]

Title: Learning to Shuffle: Block Reshuffling and Reversal Schemes for Stochastic Optimization
Authors: Lam M. Nguyen, Dzung T. Phan, Jayant Kalagnanam

Abstract: Shuffling strategies for stochastic gradient descent (SGD), including incremental gradient, shuffle-once, and random reshuffling, are supported by rigorous convergence analyses for arbitrary within-epoch permutations. In particular, random reshuffling is known to improve optimization constants relative to cyclic and shuffle-once schemes. However, existing theory offers limited guidance on how to design new data-ordering schemes that further improve optimization constants or stability beyond random reshuffling. In this paper, we design a pipeline using a large language model (LLM)-guided program evolution framework to discover an effective shuffling rule for without-replacement SGD. Abstracting from this instance, we identify two fundamental structural components: block reshuffling and paired reversal. We analyze these components separately and show that block reshuffling strictly reduces prefix-gradient variance constants within the unified shuffling framework, yielding provable improvements over random reshuffling under mild conditions. Separately, w...
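To make the two structural components named in the abstract concrete, the following is a minimal sketch of a without-replacement ordering built from block reshuffling (partition the epoch into contiguous blocks and shuffle the block order) combined with a paired reversal step (reverse the internal order of alternating blocks). The abstract does not specify the discovered rule, so the function name, block pairing, and all details here are hypothetical illustrations of the two components, not the paper's actual scheme.

```python
import random

def block_reshuffle_with_paired_reversal(n, block_size, seed=None):
    """Produce a without-replacement ordering of range(n).

    (1) Block reshuffling: partition indices into contiguous blocks
        and shuffle the order of the blocks.
    (2) Paired reversal: reverse the internal order of every second
        block after the block-level shuffle.

    Hypothetical illustration only; the paper's exact rule may differ.
    """
    rng = random.Random(seed)
    blocks = [list(range(i, min(i + block_size, n)))
              for i in range(0, n, block_size)]
    rng.shuffle(blocks)            # block reshuffling
    for j in range(1, len(blocks), 2):
        blocks[j].reverse()        # paired reversal on alternating blocks
    return [idx for block in blocks for idx in block]

# Each epoch of without-replacement SGD would visit samples in this order.
order = block_reshuffle_with_paired_reversal(10, block_size=3, seed=0)
print(sorted(order) == list(range(10)))  # still a valid permutation
```

Any scheme of this shape remains a valid permutation of the dataset each epoch, so it fits the "arbitrary within-epoch permutations" setting the abstract says the unified shuffling analyses cover.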