[2510.13044] SceneAdapt: Scene-aware Adaptation of Human Motion Diffusion
Computer Science > Computer Vision and Pattern Recognition
arXiv:2510.13044 (cs)
[Submitted on 14 Oct 2025 (v1), last revised 30 Mar 2026 (this version, v2)]

Title: SceneAdapt: Scene-aware Adaptation of Human Motion Diffusion
Authors: Jungbin Cho, Minsu Kim, Jisoo Kim, Ce Zheng, Laszlo A. Jeni, Ming-Hsuan Yang, Youngjae Yu, Seonjoo Kim

Abstract: Human motion is inherently diverse and semantically rich, while also shaped by the surrounding scene. However, existing motion generation approaches fail to generate semantically diverse motion while simultaneously respecting geometric scene constraints, since constructing large-scale datasets with both rich text-motion coverage and precise scene interactions is extremely challenging. In this work, we introduce SceneAdapt, a two-stage adaptation framework that enables semantically diverse, scene-aware human motion generation from text without large-scale paired text-scene-motion data. Our key idea is to use motion inbetweening, a learnable proxy task that requires no text, as a bridge between two disjoint resources: a text-motion dataset and a scene-motion dataset. By first adapting a text-to-motion model through inbetweening and then through scene-aware inbetweening, SceneAdapt injects geometric scene constraints into text-conditioned generation while preserving semantic diversity.
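
The abstract only sketches the two-stage adaptation schedule (stage 1: text-free motion inbetweening on the text-motion dataset; stage 2: scene-aware inbetweening on the scene-motion dataset), so the following is a minimal, hypothetical Python sketch of that training flow, not the authors' code. All class and function names (TextToMotionDiffusion, Adapter, train_stage) are illustrative placeholders; the key assumption shown is that the pretrained backbone stays frozen while a lightweight adapter learns each proxy task.

import torch
import torch.nn as nn

class TextToMotionDiffusion(nn.Module):
    """Stand-in for a pretrained text-to-motion diffusion backbone (hypothetical)."""
    def __init__(self, dim=64):
        super().__init__()
        self.backbone = nn.Linear(dim, dim)

    def forward(self, x_noisy, cond):
        # Denoise a noisy motion segment given some conditioning signal.
        return self.backbone(x_noisy + cond)

class Adapter(nn.Module):
    """Lightweight residual adapter trained on a proxy task while the backbone is frozen."""
    def __init__(self, dim=64):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, h):
        return h + self.proj(h)

def train_stage(model, adapter, batches, make_cond):
    """One adaptation stage: freeze the backbone, optimize only the adapter."""
    for p in model.parameters():
        p.requires_grad_(False)
    opt = torch.optim.Adam(adapter.parameters(), lr=1e-4)
    for x_noisy, target, aux in batches:
        # Stage 1: cond = keyframes only (inbetweening, no text needed).
        # Stage 2: cond = keyframes plus scene features (scene-aware inbetweening).
        cond = make_cond(aux)
        pred = adapter(model(x_noisy, cond))
        loss = nn.functional.mse_loss(pred, target)
        opt.zero_grad()
        loss.backward()
        opt.step()

if __name__ == "__main__":
    dim = 64
    model = TextToMotionDiffusion(dim)
    inbetween_adapter, scene_adapter = Adapter(dim), Adapter(dim)
    # Random tensors stand in for (noisy motion, target motion, conditioning features).
    fake_batches = [(torch.randn(8, dim), torch.randn(8, dim), torch.randn(8, dim))
                    for _ in range(2)]
    # Stage 1: adapt via text-free motion inbetweening on text-motion data.
    train_stage(model, inbetween_adapter, fake_batches, make_cond=lambda a: a)
    # Stage 2: adapt via scene-aware inbetweening on scene-motion data.
    train_stage(model, scene_adapter, fake_batches, make_cond=lambda a: a)

The point of the sketch is the bridging structure: because inbetweening requires no text labels, the same frozen text-conditioned backbone can be adapted first on the text-motion corpus and then on the disjoint scene-motion corpus, which is how the paper avoids needing paired text-scene-motion data.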