[2511.14617] Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning
Computer Science > Distributed, Parallel, and Cluster Computing

arXiv:2511.14617 (cs)
[Submitted on 18 Nov 2025 (v1), last revised 3 Apr 2026 (this version, v3)]

Title: Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning
Authors: Ruoyu Qin, Weiran He, Weixiao Huang, Yangkun Zhang, Yikai Zhao, Bo Pang, Xinran Xu, Yingdi Shan, Yongwei Wu, Mingxing Zhang

Abstract: Reinforcement Learning (RL) has emerged as a critical technique for advancing modern Large Language Models (LLMs), yet existing synchronous RL systems face severe performance bottlenecks. The rollout phase, which dominates end-to-end iteration time, suffers from substantial long-tail latency and poor resource utilization due to inherent workload imbalance. We present Seer, a novel context learning RL system that addresses these challenges through a key observation: requests sharing the same prompt exhibit strong similarities in output lengths and response patterns. Leveraging this insight, Seer introduces three coordinated techniques: (1) divided rollout for dynamic load balancing, (2) context-aware scheduling to mitigate long-tail request delays, and (3) adaptive grouped speculative decoding to accelerate generation. These mechanisms work in concert to markedly reduce long-tail latency and improve resource efficiency during...
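The key observation behind context-aware scheduling — that requests sharing a prompt produce outputs of similar length — suggests a simple scheduling heuristic. The sketch below is an illustrative approximation, not the paper's actual implementation: it estimates a pending request's output length from its already-finished siblings and dispatches the longest-expected groups first, so the tail does not start last. All names (`schedule`, `finished_lengths`, `default_len`) are hypothetical.

```python
# Hypothetical sketch of context-aware scheduling: finished sibling
# requests of the same prompt give a length estimate for pending ones,
# and longest-expected groups are scheduled first to shrink the tail.
from statistics import mean

def schedule(pending, finished_lengths, default_len=512):
    """Order pending (prompt_id, request_id) pairs longest-expected-first.

    finished_lengths maps prompt_id -> observed output lengths of
    completed sibling requests sharing that prompt.
    """
    def expected_len(item):
        prompt_id, _ = item
        obs = finished_lengths.get(prompt_id)
        # No siblings finished yet: fall back to a fixed default estimate.
        return mean(obs) if obs else default_len
    return sorted(pending, key=expected_len, reverse=True)

pending = [("p1", 0), ("p2", 0), ("p3", 0)]
finished = {"p1": [100, 120], "p2": [4000, 3900]}  # p3 has no data yet
order = schedule(pending, finished)
# p2 (~3950 tokens expected) is dispatched first, then p3 (default 512),
# then p1 (~110), so the likely-longest rollout is never started last.
```

A production scheduler would refresh these estimates online as more siblings complete; this static version only illustrates the ordering principle.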