[2603.02613] Real-Time Generative Policy via Langevin-Guided Flow Matching for Autonomous Driving
Computer Science > Machine Learning
arXiv:2603.02613 (cs)
[Submitted on 3 Mar 2026]

Title: Real-Time Generative Policy via Langevin-Guided Flow Matching for Autonomous Driving
Authors: Tianze Zhu, Yinuo Wang, Wenjun Zou, Tianyi Zhang, Likun Wang, Letian Tao, Feihong Zhang, Yao Lyu, Shengbo Eben Li

Abstract: Reinforcement learning (RL) is a fundamental methodology in autonomous driving systems, where generative policies show considerable promise: their ability to model complex distributions enhances exploration. However, their high inference latency severely impedes deployment in real-time decision-making and control. To address this, we propose diffusion actor-critic with entropy regulator via flow matching (DACER-F), which introduces flow matching into online RL and enables the generation of competitive actions in a single inference step. Leveraging Langevin dynamics and gradients of the Q-function, DACER-F dynamically optimizes actions drawn from experience replay toward a target distribution that balances high Q-values with exploratory behavior. The flow policy is then trained to efficiently map a simple prior distribution to this dynamic target. In complex multi-lane and intersection simulations, DACER-F outperforms baselines diffusion...
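The two mechanisms the abstract describes can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the authors' implementation): a toy quadratic Q-function stands in for the learned critic, a Langevin loop refines a replayed action toward high-Q regions while noise preserves exploration, and a linear-interpolant flow-matching target shows the regression objective that lets the policy map prior noise to an action in one Euler step. All names and hyperparameters here are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the learned critic: a quadratic Q with maximum at A_STAR.
# (In the paper, Q is a neural network and its gradient comes from autodiff.)
A_STAR = np.array([0.5, -0.3])

def q_value(a):
    return -np.sum((a - A_STAR) ** 2)

def q_grad(a):
    return -2.0 * (a - A_STAR)

def langevin_refine(a, steps=50, step_size=0.05, temperature=0.01):
    """Refine a replayed action via Langevin dynamics on the Q-landscape.

    Each step ascends the Q-gradient and injects Gaussian noise, so the
    iterates approximately sample exp(Q/temperature) -- a target that
    balances high Q-values with exploratory spread.
    """
    for _ in range(steps):
        noise = rng.normal(size=a.shape)
        a = a + step_size * q_grad(a) + np.sqrt(2.0 * step_size * temperature) * noise
    return a

def flow_matching_target(x0, x1, t):
    """Linear-interpolant flow matching (a common construction, assumed here).

    The velocity network v_theta(x_t, t) is regressed onto the constant
    velocity (x1 - x0); at deployment, a single Euler step
    a = x0 + v_theta(x0, 0) maps prior noise straight to an action.
    """
    x_t = (1.0 - t) * x0 + t * x1
    velocity = x1 - x0
    return x_t, velocity

# Usage: refine a replayed action, then form a flow-matching training pair.
a_replay = np.array([2.0, 2.0])
a_target = langevin_refine(a_replay)
x0 = rng.normal(size=2)                      # sample from the simple prior
x_t, v = flow_matching_target(x0, a_target, t=0.3)
```

In this sketch the refined action plays the role of the "dynamic target" sample: the flow policy never needs to run the Langevin chain at inference time, which is what recovers single-step generation.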