[2603.19979] X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving
Computer Science > Computer Vision and Pattern Recognition

arXiv:2603.19979 (cs) [Submitted on 20 Mar 2026]

Title: X-World: Controllable Ego-Centric Multi-Camera World Models for Scalable End-to-End Driving

Authors: Chaoda Zheng, Sean Li, Jinhao Deng, Zhennan Wang, Shijia Chen, Liqiang Xiao, Ziheng Chi, Hongbin Lin, Kangjie Chen, Boyang Wang, Yu Zhang, Xianming Liu

Abstract: Scalable and reliable evaluation is increasingly critical in the end-to-end era of autonomous driving, where vision-language-action (VLA) policies directly map raw sensor streams to driving actions. Yet, current evaluation pipelines still rely heavily on real-world road testing, which is costly, biased toward limited scenario coverage, and difficult to reproduce. These challenges motivate a real-world simulator that can generate realistic future observations under proposed actions while remaining controllable and stable over long horizons. We present X-World, an action-conditioned multi-camera generative world model that simulates future observations directly in video space. Given synchronized multi-view camera history and a future action sequence, X-World generates future multi-camera video streams that follow the commanded actions. To ensure reproducible and editable scene rollouts, X-World further supports option...
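The abstract describes an interface in which the model consumes synchronized multi-view camera history plus a future action sequence and autoregressively emits future multi-camera frames. The sketch below illustrates only that interface shape, not X-World's actual method: the class name, tensor layout `(T, cameras, H, W, 3)`, and the placeholder dynamics (perturbing the last frame) are all assumptions for illustration.

```python
import numpy as np


class ActionConditionedWorldModel:
    """Hypothetical interface for an action-conditioned multi-camera
    world model, in the spirit of the abstract above. The dynamics here
    are a stand-in; a real model would run a generative video backbone
    conditioned on the commanded action."""

    def __init__(self, num_cameras: int, height: int, width: int, seed: int = 0):
        self.num_cameras = num_cameras
        self.height = height
        self.width = width
        self.rng = np.random.default_rng(seed)

    def step(self, history: np.ndarray, action: np.ndarray) -> np.ndarray:
        # history: (T, num_cameras, H, W, 3) in [0, 1]; action: (action_dim,)
        # Placeholder: perturb the most recent multi-camera frame.
        last = history[-1]
        noise = 0.01 * self.rng.standard_normal(last.shape)
        return np.clip(last + noise, 0.0, 1.0)

    def rollout(self, history: np.ndarray, actions: np.ndarray) -> np.ndarray:
        # Autoregressive rollout: each generated multi-camera frame is
        # appended to the history before predicting the next one, so the
        # simulator stays conditioned on its own outputs over the horizon.
        frames = []
        buf = history
        for action in actions:
            nxt = self.step(buf, action)
            frames.append(nxt)
            buf = np.concatenate([buf, nxt[None]], axis=0)
        return np.stack(frames)  # (len(actions), num_cameras, H, W, 3)
```

Usage under the same assumptions: `rollout` on a history of shape `(T, 6, H, W, 3)` with `K` actions returns `K` future multi-camera frames, one per commanded action.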