[2602.08961] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.08961 (cs)
[Submitted on 9 Feb 2026 (v1), last revised 28 Mar 2026 (this version, v2)]

Title: MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE
Authors: Ruijie Zhu, Jiahao Lu, Wenbo Hu, Xiaoguang Han, Jianfei Cai, Ying Shan, Chuanxia Zheng

Abstract: We present MotionCrafter, a framework that leverages video generators to jointly reconstruct 4D geometry and estimate dense motion from a monocular video. The key idea is a joint representation of dense 3D point maps and 3D scene flows in a shared coordinate system, together with a 4D VAE tailored to learn this representation effectively. Unlike prior work that strictly aligns 3D values and latents with RGB VAE latents, despite their fundamentally different distributions, we show that such alignment is unnecessary and can hurt performance. Instead, we propose a new data normalization and VAE training strategy that better transfers diffusion priors and greatly improves reconstruction quality. Extensive experiments on multiple datasets show that MotionCrafter achieves state-of-the-art performance in both geometry reconstruction and dense scene flow estimation, delivering 38.64% and 25.0% improvements in geometry and motion reconstruction, respectively, all without any post-optimizatio...
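
To make the joint representation described in the abstract more concrete, here is a minimal sketch in plain Python. It assumes each pixel carries a 3D point and a 3D scene-flow vector expressed in the same coordinate system, so the two can be concatenated into one 6-vector and a point's next-frame position is simply point plus flow. The function names (`pack_joint`, `mean_norm_scale`, `normalize_joint`, `advect`) are hypothetical, and the mean-distance scaling is a generic stand-in, not the normalization strategy proposed in the paper, which the abstract does not specify.

```python
from math import sqrt


def pack_joint(points, flows):
    """Concatenate per-pixel 3D points and 3D scene-flow vectors into 6-vectors.

    Both inputs are lists of (x, y, z) tuples in one shared coordinate system.
    """
    if len(points) != len(flows):
        raise ValueError("points and flows must have the same length")
    return [tuple(p) + tuple(f) for p, f in zip(points, flows)]


def mean_norm_scale(points):
    """Mean Euclidean distance of the points from the origin.

    Assumption: a generic scale factor for illustration only,
    not the paper's normalization.
    """
    return sum(sqrt(x * x + y * y + z * z) for x, y, z in points) / len(points)


def normalize_joint(joint, scale):
    """Divide geometry and motion by the same scale so they stay consistent."""
    return [tuple(v / scale for v in row) for row in joint]


def advect(joint):
    """Next-frame position of each point: point + flow.

    This is only meaningful because points and flows share a coordinate system.
    """
    return [(x + dx, y + dy, z + dz) for x, y, z, dx, dy, dz in joint]


points = [(0.0, 0.0, 2.0), (3.0, 0.0, 4.0)]
flows = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]  # the second point moves along x

joint = pack_joint(points, flows)
scale = mean_norm_scale(points)
normalized = normalize_joint(joint, scale)
moved = advect(joint)
```

Because a single scale is applied to both halves of each 6-vector, advecting the normalized representation gives the same result as advecting first and normalizing afterwards, up to floating-point rounding.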