[2510.08713] Towards Unified World Models for Visual Navigation via Memory-Augmented Planning and Foresight

Computer Science > Artificial Intelligence

arXiv:2510.08713 (cs)

Submitted on 9 Oct 2025 (v1), last revised 22 Mar 2026 (this version, v2)

Title: Towards Unified World Models for Visual Navigation via Memory-Augmented Planning and Foresight

Authors: Yifei Dong, Fengyi Wu, Guangyu Chen, Lingdong Kong, Xu Zhu, Qiyu Hu, Yuxuan Zhou, Jingdong Sun, Jun-Yan He, Qi Dai, Alexander G. Hauptmann, Zhi-Qi Cheng

Abstract: Enabling embodied agents to imagine future states is essential for robust and generalizable visual navigation. Yet, state-of-the-art systems typically rely on modular designs that decouple navigation planning from visual world modeling, which often induces state-action misalignment and weak adaptability in novel or dynamic scenarios. We propose UniWM, a unified, memory-augmented world model that integrates egocentric visual foresight and planning within a single multimodal autoregressive backbone. UniWM explicitly grounds action selection in visually imagined outcomes, tightly aligning prediction with control. Meanwhile, a hierarchical memory mechanism fuses short-term perceptual cues with longer-term trajectory context, supporting stable and coherent reasoning over extended horizons. Extensive experiments on four challenging benchmarks (Go Stanford, ReCon, SCAND, HuRoN) and the 1X...

Originally published on March 24, 2026. Curated by AI News.
