[2510.01037] CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs
Computer Science > Machine Learning

arXiv:2510.01037 (cs)

[Submitted on 1 Oct 2025 (v1), last revised 23 Mar 2026 (this version, v2)]

Title: CurES: From Gradient Analysis to Efficient Curriculum Learning for Reasoning LLMs

Authors: Yongcheng Zeng, Zexu Sun, Bokai Ji, Erxue Min, Hengyi Cai, Shuaiqiang Wang, Dawei Yin, Haifeng Zhang, Xu Chen, Jun Wang

Abstract: Curriculum learning plays a crucial role in enhancing the training efficiency of large language models (LLMs) on reasoning tasks. However, existing methods often fail to adequately account for variations in prompt difficulty or rely on simplistic filtering mechanisms to select prompt datasets within a narrow criterion range, resulting in significant computational waste. In this work, we approach the problem from the perspective of reinforcement learning gradient optimization, offering a systematic and theoretical investigation into how to improve the training efficiency of LLMs. We identify two key factors influencing training efficiency: the selection of training prompts and the allocation of rollout quantities across different prompts. Our theoretical analysis reveals that the sampling distribution of prompts dictates the convergence rate of gradient descent, while the allocation of the rollout quantity influences the consistency and stabili...
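The two factors the abstract identifies, which prompts to train on and how many rollouts to give each, can be illustrated with a minimal sketch. This is a hypothetical example, not the CurES algorithm (whose details are not given on this page): it weights each prompt by p(1-p), a standard proxy for gradient signal under binary rewards where p is the prompt's current success rate, and splits a rollout budget in proportion to those weights.

```python
# Hypothetical sketch of curriculum-style prompt weighting and rollout
# allocation; illustrative only, not the paper's CurES method.

def selection_weights(success_rates):
    # Prompts with success rate near 0.5 carry the most gradient signal
    # under binary rewards; the weight p * (1 - p) peaks there and
    # vanishes for prompts that are always solved or never solved.
    return [p * (1.0 - p) for p in success_rates]

def allocate_rollouts(success_rates, total_rollouts):
    # Split the rollout budget proportionally to each prompt's weight,
    # guaranteeing at least one rollout per prompt.
    weights = selection_weights(success_rates)
    total = sum(weights) or 1.0
    return [max(1, round(total_rollouts * w / total)) for w in weights]

# Example: four prompts at different difficulty levels, 100-rollout budget.
rates = [0.05, 0.5, 0.9, 0.35]
print(allocate_rollouts(rates, 100))
```

Under this weighting, the medium-difficulty prompt (success rate 0.5) receives the largest share of rollouts, while the nearly-always-solved prompt (0.9) receives few, which matches the abstract's point that uniform allocation wastes computation on uninformative prompts.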