[2510.13044] SceneAdapt: Scene-aware Adaptation of Human Motion Diffusion
Computer Science > Computer Vision and Pattern Recognition
arXiv:2510.13044 (cs)
[Submitted on 14 Oct 2025 (v1), last revised 30 Mar 2026 (this version, v2)]

Title: SceneAdapt: Scene-aware Adaptation of Human Motion Diffusion
Authors: Jungbin Cho, Minsu Kim, Jisoo Kim, Ce Zheng, Laszlo A. Jeni, Ming-Hsuan Yang, Youngjae Yu, Seonjoo Kim

Abstract: Human motion is inherently diverse and semantically rich, while also shaped by the surrounding scene. However, existing motion generation approaches fail to generate semantically diverse motion while simultaneously respecting geometric scene constraints, since constructing large-scale datasets with both rich text-motion coverage and precise scene interactions is extremely challenging. In this work, we introduce SceneAdapt, a two-stage adaptation framework that enables semantically diverse, scene-aware human motion generation from text without large-scale paired text-scene-motion data. Our key idea is to use motion inbetweening, a learnable proxy task that requires no text, as a bridge between two disjoint resources: a text-motion dataset and a scene-motion dataset. By first adapting a text-to-motion model through inbetweening and then through scene-aware inbetweening, SceneAdapt injects geometric scene constraints into text-conditioned generation while preserving semantic diversity.
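
The abstract only sketches the two-stage adaptation schedule (stage 1: text-free motion inbetweening on the text-motion dataset; stage 2: scene-aware inbetweening on the scene-motion dataset), so the following is a minimal, hypothetical Python sketch of that training flow, not the authors' code. All class and function names (TextToMotionDiffusion, Adapter, train_stage) are illustrative placeholders; the key assumption shown is that the pretrained backbone stays frozen while a lightweight adapter learns each proxy task.

import torch
import torch.nn as nn

class TextToMotionDiffusion(nn.Module):
    """Stand-in for a pretrained text-to-motion diffusion backbone (hypothetical)."""
    def __init__(self, dim=64):
        super().__init__()
        self.backbone = nn.Linear(dim, dim)

    def forward(self, x_noisy, cond):
        # Denoise a noisy motion segment given some conditioning signal.
        return self.backbone(x_noisy + cond)

class Adapter(nn.Module):
    """Lightweight residual adapter trained on a proxy task while the backbone is frozen."""
    def __init__(self, dim=64):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, h):
        return h + self.proj(h)

def train_stage(model, adapter, batches, make_cond):
    """One adaptation stage: freeze the backbone, optimize only the adapter."""
    for p in model.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(adapter.parameters(), lr=1e-4)
    for x_noisy, target, aux in batches:
        # Stage 1: cond = keyframes only (inbetweening, no text needed).
        # Stage 2: cond = keyframes plus scene features (scene-aware inbetweening).
        cond = make_cond(aux)
        pred = adapter(model(x_noisy, cond))
        loss = nn.functional.mse_loss(pred, target)
        opt.zero_grad()
        loss.backward()
        opt.step()

if __name__ == "__main__":
    dim = 64
    model = TextToMotionDiffusion(dim)
    inbetween_adapter, scene_adapter = Adapter(dim), Adapter(dim)
    # Random tensors stand in for (noisy motion, target motion, conditioning features).
    fake_batches = [(torch.randn(8, dim), torch.randn(8, dim), torch.randn(8, dim))
                    for _ in range(2)]
    # Stage 1: adapt via text-free motion inbetweening on text-motion data.
    train_stage(model, inbetween_adapter, fake_batches, make_cond=lambda a: a)
    # Stage 2: adapt via scene-aware inbetweening on scene-motion data.
    train_stage(model, scene_adapter, fake_batches, make_cond=lambda a: a)

The point of the sketch is the bridging structure: because inbetweening requires no text labels, the same frozen text-conditioned backbone can be adapted first on the text-motion corpus and then on the disjoint scene-motion corpus, which is how the paper avoids needing paired text-scene-motion data.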