[2603.24989] Learning Rollout from Sampling: An R1-Style Tokenized Traffic Simulation Model
Computer Science > Robotics
arXiv:2603.24989 (cs)
[Submitted on 26 Mar 2026]

Title: Learning Rollout from Sampling: An R1-Style Tokenized Traffic Simulation Model
Authors: Ziyan Wang, Peng Chen, Ding Li, Chiwei Li, Qichao Zhang, Zhongpu Xia, Guizhen Yu

Abstract: Learning diverse, high-fidelity traffic simulations from human driving demonstrations is crucial for autonomous driving evaluation. The next-token prediction (NTP) paradigm, widely adopted in large language models (LLMs), has recently been applied to traffic simulation, where it achieves iterative improvement via supervised fine-tuning (SFT). However, such methods limit active exploration of potentially valuable motion tokens, particularly in suboptimal regions. Entropy patterns offer a promising way to drive exploration by motion-token uncertainty. Motivated by this insight, we propose R1Sim, a novel tokenized traffic simulation policy that represents an initial attempt to apply reinforcement learning based on motion-token entropy patterns, and we systematically analyze the impact of different motion tokens on simulation outcomes. Specifically, we introduce an entropy-guided adaptive sampling mechanism that focuses on previously overlooked motion tokens with high uncertainty yet high potential. We further optimize motion behavior...