[2604.09527] Envisioning the Future, One Step at a Time
Computer Science > Computer Vision and Pattern Recognition
arXiv:2604.09527 (cs)
[Submitted on 10 Apr 2026]

Title: Envisioning the Future, One Step at a Time
Authors: Stefan Andreas Baumann, Jannik Wiese, Tommaso Martorella, Mahdi M. Kalayeh, Björn Ommer

Abstract: Accurately anticipating how complex, diverse scenes will evolve requires models that represent uncertainty, simulate along extended interaction chains, and efficiently explore many plausible futures. Yet most existing approaches rely on dense video or latent-space prediction, expending substantial capacity on dense appearance rather than on the underlying sparse trajectories of points in the scene. This makes large-scale exploration of future hypotheses costly and limits performance when long-horizon, multi-modal motion is essential. We address this by formulating the prediction of open-set future scene dynamics as step-wise inference over sparse point trajectories. Our autoregressive diffusion model advances these trajectories through short, locally predictable transitions, explicitly modeling the growth of uncertainty over time. This dynamics-centric representation enables fast rollout of thousands of diverse futures from a single image, optionally guided by initial constraints on motion, while maintaining physical plausibility and long-range coherence. We further in...