[2602.15922] World Action Models are Zero-shot Policies

arXiv - Machine Learning 4 min read Article

Summary

The paper introduces DreamZero, a World Action Model (WAM) built on a pretrained video diffusion backbone. It jointly predicts future world states (as video) and robot actions, and the authors report over 2x better generalization to new tasks and environments than state-of-the-art Vision-Language-Action (VLA) models in real robot experiments.

Why It Matters

State-of-the-art Vision-Language-Action (VLA) models generalize well semantically but struggle with unseen physical motions in novel environments. DreamZero targets that gap by using video as a dense representation of how the world evolves, which lets it learn diverse skills from heterogeneous robot data without relying on repetitive demonstrations.

Key Takeaways

  • DreamZero achieves over 2x improvement in generalization to new tasks and environments compared to state-of-the-art VLAs in real robot experiments.
  • The model enables real-time closed-loop control at 7 Hz using a 14B autoregressive video diffusion model.
  • Cross-embodiment transfer allows for significant performance gains with minimal training data.
  • DreamZero supports few-shot embodiment adaptation, retaining zero-shot generalization capabilities.
  • The approach highlights the potential of video data in enhancing robotic learning and adaptability.
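
To give a feel for what closed-loop control at 7 Hz implies, here is a minimal Python sketch of a fixed-rate control loop. It is not the paper's implementation: `run_closed_loop`, `policy`, `observe`, and `act` are hypothetical names introduced for illustration, and the policy is a stand-in for whatever model produces actions. The only figure taken from the article is the 7 Hz rate, which leaves roughly 143 ms per observe-predict-act cycle.

```python
import time
from typing import Callable, List

CONTROL_HZ = 7.0                  # rate quoted in the takeaways above
STEP_BUDGET_S = 1.0 / CONTROL_HZ  # about 0.143 s per control cycle


def run_closed_loop(
    policy: Callable[[List[float]], List[float]],
    observe: Callable[[], List[float]],
    act: Callable[[List[float]], None],
    steps: int,
    hz: float = CONTROL_HZ,
) -> int:
    """Run a fixed-rate loop and return how many steps overran the time budget.

    All three callables are hypothetical placeholders: `observe` reads the
    current observation, `policy` maps it to an action, `act` sends the action.
    """
    budget = 1.0 / hz
    overruns = 0
    for _ in range(steps):
        start = time.monotonic()
        action = policy(observe())  # stand-in for model inference
        act(action)
        elapsed = time.monotonic() - start
        if elapsed > budget:
            overruns += 1           # inference was too slow for this rate
        else:
            time.sleep(budget - elapsed)  # wait out the rest of the cycle
    return overruns
```

In a loop like this, each new observation feeds the next prediction, which is what makes the control closed-loop; any inference call that takes longer than the budget shows up as an overrun.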

Computer Science > Robotics

arXiv:2602.15922 (cs)
[Submitted on 17 Feb 2026]

Title: World Action Models are Zero-shot Policies

Authors: Seonghyeon Ye, Yunhao Ge, Kaiyuan Zheng, Shenyuan Gao, Sihyun Yu, George Kurian, Suneel Indupuru, You Liang Tan, Chuning Zhu, Jiannan Xiang, Ayaan Malik, Kyungmin Lee, William Liang, Nadun Ranawaka, Jiasheng Gu, Yinzhen Xu, Guanzhi Wang, Fengyuan Hu, Avnish Narayan, Johan Bjorck, Jing Wang, Gwanghyun Kim, Dantong Niu, Ruijie Zheng, Yuqi Xie, Jimmy Wu, Qi Wang, Ryan Julian, Danfei Xu, Yilun Du, Yevgen Chebotar, Scott Reed, Jan Kautz, Yuke Zhu, Linxi "Jim" Fan, Joel Jang

Abstract: State-of-the-art Vision-Language-Action (VLA) models excel at semantic generalization but struggle to generalize to unseen physical motions in novel environments. We introduce DreamZero, a World Action Model (WAM) built upon a pretrained video diffusion backbone. Unlike VLAs, WAMs learn physical dynamics by predicting future world states and actions, using video as a dense representation of how the world evolves. By jointly modeling video and action, DreamZero learns diverse skills effectively from heterogeneous robot data without relying on repetitive demonstrations. This results in over 2x improvement in generalization to new tasks and environments compared to state-of-the-art VLAs in real robot experiments. Crucially, through...

