[2505.24298] AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning
Computer Science > Machine Learning

arXiv:2505.24298 (cs)

[Submitted on 30 May 2025 (v1), last revised 2 Mar 2026 (this version, v5)]

Title: AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Authors: Wei Fu, Jiaxuan Gao, Xujie Shen, Chen Zhu, Zhiyu Mei, Chuyi He, Shusheng Xu, Guo Wei, Jun Mei, Jiashu Wang, Tongkai Yang, Binhang Yuan, Yi Wu

Abstract: Reinforcement learning (RL) has become a dominant paradigm for training large language models (LLMs), particularly for reasoning tasks. Effective RL for LLMs requires massive parallelization, creating an urgent need for efficient training systems. Most existing large-scale RL systems for LLMs are synchronous, alternating generation and training in a batch setting where the rollouts in each training batch are generated by the same model. This approach stabilizes RL training but suffers from severe system-level inefficiency: generation must wait until the longest output in the batch is completed before the model can be updated, resulting in GPU underutilization. We present AReaL, a fully asynchronous RL system that completely decouples generation from training. Rollout workers in AReaL continuously generate new outputs without waiting, while training workers update the model whenever a batch of data is collected. AReaL also incorporates...
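The decoupling the abstract describes can be illustrated with a minimal sketch: rollout workers push finished generations into a shared buffer without blocking on training, while a trainer consumes fixed-size batches and bumps the model version as soon as each batch is full. All names here (`rollout_worker`, `trainer`, the buffer layout) are illustrative assumptions, not AReaL's actual API.

```python
import queue
import threading

BATCH_SIZE = 4     # trainer updates once this many rollouts are collected
NUM_ROLLOUTS = 12  # total generations produced in this toy run

buffer = queue.Queue()  # shared rollout buffer between workers
model_version = 0       # incremented on every training update
updates_done = []       # batch sizes seen by the trainer, for inspection

def rollout_worker():
    # Continuously generate outputs; generation itself is stubbed out.
    # The worker never waits for a training step before producing more data.
    for i in range(NUM_ROLLOUTS):
        sample = {"output": f"rollout-{i}", "version": model_version}
        buffer.put(sample)

def trainer():
    # Update the model whenever a full batch of data has been collected.
    global model_version
    collected = []
    for _ in range(NUM_ROLLOUTS):
        collected.append(buffer.get())
        if len(collected) == BATCH_SIZE:
            model_version += 1
            updates_done.append(len(collected))
            collected = []

gen = threading.Thread(target=rollout_worker)
train = threading.Thread(target=trainer)
gen.start(); train.start()
gen.join(); train.join()

print(updates_done)   # three full batches of 4
print(model_version)  # three model updates
```

In a real system the buffer would hold token sequences and rewards, and staleness control would be needed because rollouts may carry an older `version` than the model being updated; this sketch only shows the producer/consumer decoupling.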