[2603.19979] X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving

[2603.19979] X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving

arXiv - AI 4 min read

About this article

Abstract page for arXiv paper 2603.19979: X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving

Computer Science > Computer Vision and Pattern Recognition arXiv:2603.19979 (cs) [Submitted on 20 Mar 2026] Title:X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving Authors:Chaoda Zheng, Sean Li, Jinhao Deng, Zhennan Wang, Shijia Chen, Liqiang Xiao, Ziheng Chi, Hongbin Lin, Kangjie Chen, Boyang Wang, Yu Zhang, Xianming Liu View a PDF of the paper titled X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving, by Chaoda Zheng and 11 other authors View PDF HTML (experimental) Abstract:Scalable and reliable evaluation is increasingly critical in the end-to-end era of autonomous driving, where vision--language--action (VLA) policies directly map raw sensor streams to driving actions. Yet, current evaluation pipelines still rely heavily on real-world road testing, which is costly, biased toward limited scenario coverage, and difficult to reproduce. These challenges motivate a real-world simulator that can generate realistic future observations under proposed actions, while remaining controllable and stable over long horizons. We present X-World, an action-conditioned multi-camera generative world model that simulates future observations directly in video space. Given synchronized multi-view camera history and a future action sequence, X-World generates future multi-camera video streams that follow the commanded actions. To ensure reproducible and editable scene rollouts, X-World further supports option...

Originally published on March 23, 2026. Curated by AI News.

Related Articles

Machine Learning

[P] I trained an AI to play Resident Evil 4 Remake using Behavioral Cloning + LSTM

I recorded gameplay trajectories in RE4's village — running, shooting, reloading, dodging — and used Behavioral Cloning to train a model ...

Reddit - Machine Learning · 1 min ·
Machine Learning

[D] Why does it seem like open source materials on ML are incomplete? this is not enough...

Many times when I try to deeply understand a topic in machine learning — whether it's a new architecture, a quantization method, a full t...

Reddit - Machine Learning · 1 min ·
Llms

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min ·
Top 10 AI certifications and courses for 2026
Ai Startups

Top 10 AI certifications and courses for 2026

This article reviews the top 10 AI certifications and courses for 2026, highlighting their significance in a rapidly evolving field and t...

AI Events · 15 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime