[2602.10014] A Task-Centric Theory for Iterative Self-Improvement with Easy-to-Hard Curricula
Computer Science > Machine Learning
arXiv:2602.10014 (cs)
[Submitted on 10 Feb 2026 (v1), last revised 19 Mar 2026 (this version, v2)]

Title: A Task-Centric Theory for Iterative Self-Improvement with Easy-to-Hard Curricula
Authors: Chenruo Liu, Yijun Dong, Yiqiu Shen, Qi Lei

Abstract: Iterative self-improvement fine-tunes an autoregressive large language model (LLM) on reward-verified outputs generated by the LLM itself. In contrast to the empirical success of self-improvement, the theoretical foundation of this generative, iterative procedure in a practical, finite-sample setting remains limited. We make progress toward this goal by modeling each round of self-improvement as maximum-likelihood fine-tuning on a reward-filtered distribution and deriving finite-sample guarantees for the expected reward. Our analysis reveals an explicit feedback loop where better models accept more data per iteration, supporting sustained self-improvement while explaining eventual saturation of such improvement. Adopting a task-centric view by considering reasoning tasks with multiple difficulty levels, we further prove quantifiable conditions on model initialization, task difficulty, and sample budget where easy-to-hard curricula provably achieve better guarantees than training on fixed mixtures of tasks. Our analyses ar...
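The procedure the abstract analyzes, sampling from the current model, keeping only reward-verified outputs, and refitting by maximum likelihood on the filtered data, can be illustrated with a toy sketch. Everything below is a hypothetical simplification for intuition, not the paper's construction: the "model" is a categorical distribution over a small discrete output space, the reward is a binary verifier, and MLE refitting reduces to smoothed counting. Names like `self_improve` and `expected_reward` are invented for this sketch.

```python
import random

def expected_reward(probs, reward):
    """Expected reward of a categorical model over outputs 0..len(probs)-1."""
    return sum(p * reward(x) for x, p in enumerate(probs))

def self_improve(probs, reward, n_samples=2000, rounds=3, smooth=1e-3, seed=0):
    """Toy iterative self-improvement loop (illustrative only).

    Each round: (1) generate outputs from the current model,
    (2) keep only reward-verified outputs (the reward-filtered
    distribution), (3) refit the model by MLE on the accepted data.
    Returns the final model and the number of accepted samples per
    round, which exhibits the feedback loop: a better model accepts
    more data in the next iteration.
    """
    rng = random.Random(seed)
    outputs = list(range(len(probs)))
    accepted_per_round = []
    for _ in range(rounds):
        # (1) generation: sample from the current model
        samples = rng.choices(outputs, weights=probs, k=n_samples)
        # (2) verification: keep only outputs the reward accepts
        accepted = [x for x in samples if reward(x) > 0]
        accepted_per_round.append(len(accepted))
        if not accepted:
            break  # no data passes the filter; improvement stalls
        # (3) MLE refit on the filtered data (smoothed counts)
        counts = [smooth] * len(probs)
        for x in accepted:
            counts[x] += 1
        total = sum(counts)
        probs = [c / total for c in counts]
    return probs, accepted_per_round

# Usage: 10 possible outputs, binary verifier accepting only {8, 9}.
init = [0.1] * 10
verifier = lambda x: 1.0 if x >= 8 else 0.0
final, accepted = self_improve(init, verifier)
print(expected_reward(init, verifier), expected_reward(final, verifier))
print(accepted)  # accepted counts grow round over round, then saturate
```

Under this toy setup the expected reward rises sharply after the first round and then saturates near 1, and the per-round accepted-sample counts grow, mirroring the feedback loop and eventual saturation the abstract describes.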