[2511.01266] MotionStream: Real-Time Video Generation with Interactive Motion Controls
Computer Science > Computer Vision and Pattern Recognition

arXiv:2511.01266 (cs)

[Submitted on 3 Nov 2025 (v1), last revised 1 Mar 2026 (this version, v3)]

Title: MotionStream: Real-Time Video Generation with Interactive Motion Controls

Authors: Joonghyuk Shin, Zhengqi Li, Richard Zhang, Jun-Yan Zhu, Jaesik Park, Eli Shechtman, Xun Huang

Abstract: Current motion-conditioned video generation methods suffer from prohibitive latency (minutes per video) and non-causal processing that prevents real-time interaction. We present MotionStream, which enables sub-second latency and streaming generation at up to 29 FPS on a single GPU. Our approach begins by augmenting a text-to-video model with motion control; the resulting model generates high-quality videos that adhere to the global text prompt and local motion guidance, but it cannot perform inference on the fly. We therefore distill this bidirectional teacher into a causal student through Self Forcing with Distribution Matching Distillation, enabling real-time streaming inference. Several key challenges arise when generating videos over long, potentially infinite time horizons: (1) bridging the domain gap between training on finite-length videos and extrapolating to infinite horizons, (2) sustaining high quality by preventing error accumulation, and (3) maintaining fast inference without incur...
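The abstract's central mechanism, distilling a bidirectional teacher into a causal student so frames can be emitted while generation is still running, amounts to block-wise autoregressive denoising over a bounded window of past context. The sketch below is a minimal illustration of that streaming loop under stated assumptions, not the authors' implementation: `student.generate_block`, the motion-stream iterator, and `MAX_CACHE_BLOCKS` are all hypothetical names and values chosen for exposition.

```python
# Minimal sketch (NOT the paper's code) of causal, block-wise streaming
# generation with a bounded sliding-window KV cache. `student.generate_block`
# is a hypothetical stand-in for a few-step distilled causal denoiser.

from collections import deque

MAX_CACHE_BLOCKS = 8  # assumed window size; bounds per-step compute and memory

def stream_video(student, text_prompt, motion_stream, num_blocks):
    """Yield video blocks one at a time, conditioned on live motion input."""
    kv_cache = deque(maxlen=MAX_CACHE_BLOCKS)  # oldest blocks evicted automatically
    for _ in range(num_blocks):
        motion = next(motion_stream)  # latest user-drawn motion trajectories
        # The causal student attends only to the cached window of past blocks,
        # so per-block latency stays constant regardless of video length.
        block, new_kv = student.generate_block(
            prompt=text_prompt,
            motion=motion,
            past_kv=list(kv_cache),
        )
        kv_cache.append(new_kv)
        yield block  # frames can be displayed immediately (sub-second latency)
```

Bounding the cache is what keeps per-block cost constant, the property challenge (3) in the abstract asks for, and yielding blocks as they are produced lets a UI thread display frames while the user keeps drawing trajectories, which is what makes the interaction real-time rather than batch.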