[2602.15891] Learning to Drive in New Cities Without Human Demonstrations
Summary
This paper presents NOMAD, a method for adapting an autonomous driving policy to a new city without human driving demonstrations from that city. It uses self-play multi-agent reinforcement learning in a simulator built from the target city's map.
Why It Matters
As autonomous vehicles expand into new urban environments, the need for efficient adaptation methods is critical. NOMAD addresses the costly and time-consuming requirement for human demonstration data, offering a scalable solution that could accelerate the deployment of autonomous driving technologies in diverse settings.
Key Takeaways
- NOMAD enables policy adaptation for autonomous vehicles using only city maps and meta-information.
- With a simple reward function, the method substantially improves both task success rate and trajectory realism in target cities.
- Self-play multi-agent reinforcement learning is shown to be an effective, scalable alternative to data-intensive city-transfer methods.
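To make the self-play idea in the takeaways above concrete, here is a minimal sketch, not the paper's implementation. It assumes a toy one-lane ring road as the "map", three cars that all act with one shared tabular Q-learning policy, and an invented reward of +1 for reaching a goal, -1 for driving into another car, and a small per-step cost. Every name and constant (`RING_LEN`, `gap_ahead`, `move`, `train`, the reward values) is hypothetical and chosen only for illustration.

```python
import random
from collections import defaultdict

# Everything here is a hypothetical toy, not NOMAD's simulator or reward.
RING_LEN, N_AGENTS = 12, 3            # the "map": a one-lane ring road
WAIT, GO = 0, 1
R_GOAL, R_CRASH, R_STEP = 1.0, -1.0, -0.01


def gap_ahead(pos, occupied):
    """Observation: free cells before the next car ahead, capped at 2."""
    for d in (1, 2):
        if (pos + d) % RING_LEN in occupied:
            return d - 1
    return 2


def move(pos, remaining, action, occupied):
    """Apply one action; return (pos, remaining, reward, done)."""
    if action == WAIT:
        return pos, remaining, R_STEP, False
    nxt = (pos + 1) % RING_LEN
    if nxt in occupied:                      # drove into another car
        return pos, remaining, R_CRASH, True
    remaining -= 1
    if remaining == 0:                       # reached its goal cell
        return nxt, 0, R_GOAL, True
    return nxt, remaining, R_STEP, False


def train(episodes=2000, alpha=0.1, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0, 0.0])      # one Q-table shared by all cars
    for _ in range(episodes):
        cells = rng.sample(range(RING_LEN), N_AGENTS)
        cars = {i: [c, rng.randint(3, 8)] for i, c in enumerate(cells)}
        for _ in range(50):                  # step cap per episode
            if not cars:
                break
            # All cars act against the same pre-step snapshot of the road.
            occupied = {p for p, _ in cars.values()}
            pending = []
            for i, (pos, rem) in list(cars.items()):
                obs = gap_ahead(pos, occupied)
                if rng.random() < eps:       # epsilon-greedy exploration
                    act = rng.choice((WAIT, GO))
                else:
                    act = max((WAIT, GO), key=lambda a: q[obs][a])
                pos, rem, reward, done = move(pos, rem, act, occupied)
                pending.append((i, obs, act, reward, done))
                if done:
                    del cars[i]              # finished or crashed cars leave
                else:
                    cars[i] = [pos, rem]
            # Q-learning update once every car has moved.
            occupied = {p for p, _ in cars.values()}
            for i, obs, act, reward, done in pending:
                target = reward
                if not done:
                    target += gamma * max(q[gap_ahead(cars[i][0], occupied)])
                q[obs][act] += alpha * (target - q[obs][act])
    return q
```

Calling `q = train()` returns the shared table, where `q[0]` holds the action values for a car whose next cell is occupied. In this toy, the other cars on the road are driven by the same policy that is being trained, so no recorded human trajectories are needed. The learned values should favor waiting over driving into the car ahead.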
Computer Science > Robotics
arXiv:2602.15891 (cs)
[Submitted on 9 Feb 2026]
Title: Learning to Drive in New Cities Without Human Demonstrations
Authors: Zilin Wang, Saeed Rahmani, Daphne Cornelisse, Bidipta Sarkar, Alexander David Goldie, Jakob Nicolaus Foerster, Shimon Whiteson
Abstract: While autonomous vehicles have achieved reliable performance within specific operating regions, their deployment to new cities remains costly and slow. A key bottleneck is the need to collect many human demonstration trajectories when adapting driving policies to new cities that differ from those seen in training in terms of road geometry, traffic rules, and interaction patterns. In this paper, we show that self-play multi-agent reinforcement learning can adapt a driving policy to a substantially different target city using only the map and meta-information, without requiring any human demonstrations from that city. We introduce NO data Map-based self-play for Autonomous Driving (NOMAD), which enables policy adaptation in a simulator constructed based on the target-city map. Using a simple reward function, NOMAD substantially improves both task success rate and trajectory realism in target cities, demonstrating an effective and scalable alternative to data-intensive city-transfer methods.
Subjects: R...