[2510.02284] Learning to Generate Rigid Body Interactions with Video Diffusion Models
Computer Science > Computer Vision and Pattern Recognition

arXiv:2510.02284 (cs)

[Submitted on 2 Oct 2025 (v1), last revised 20 Mar 2026 (this version, v3)]

Title: Learning to Generate Rigid Body Interactions with Video Diffusion Models

Authors: David Romero, Ariana Bermudez, Viacheslav Iablochnikov, Hao Li, Fabio Pizzati, Ivan Laptev

Abstract: Recent video generation models have achieved remarkable progress and are now deployed in film, social media production, and advertising. Beyond their creative potential, such models also hold promise as world simulators for robotics and embodied decision making. Despite strong advances, current approaches still struggle to generate physically plausible object interactions and lack object-level control mechanisms. To address these limitations, we introduce KineMask, an approach for video generation that enables realistic rigid body control, interactions, and effects. Given a single image and a specified object velocity, our method generates videos with inferred motions and future object interactions. We propose a two-stage training strategy that gradually removes future motion supervision via object masks. Using this strategy, we train video diffusion models (VDMs) on synthetic scenes of simple interactions and demonstrate significant improvements and generalization to r...
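The abstract describes the two-stage strategy only at a high level: future motion supervision is provided through object masks and then gradually withdrawn. As one possible reading, the sketch below schedules away future-frame mask conditioning during a second training stage, so the model eventually relies only on the first-frame mask and the specified velocity. This is a minimal, hypothetical illustration (PyTorch assumed); the function name `mask_condition` and the linear schedule are our own choices, not details from the paper.

```python
import torch


def mask_condition(object_masks: torch.Tensor, step: int,
                   total_steps: int, stage: int) -> torch.Tensor:
    """Zero out object masks for future frames according to the training stage.

    object_masks: (B, T, 1, H, W) binary masks of the controlled object.
    Stage 1 keeps masks for all T frames (full future supervision);
    stage 2 linearly shrinks the conditioned window until only the
    first-frame mask remains (a hypothetical schedule, not the paper's).
    """
    T = object_masks.shape[1]
    if stage == 1:
        keep = T
    else:
        frac = 1.0 - step / total_steps       # decays 1 -> 0 over the stage
        keep = max(1, round(frac * T))        # always keep frame 0
    cond = object_masks.clone()
    cond[:, keep:] = 0.0                      # drop supervision for future frames
    return cond


# Toy usage: 8-frame clips with 64x64 masks.
masks = (torch.rand(2, 8, 1, 64, 64) > 0.5).float()
early = mask_condition(masks, step=0, total_steps=1000, stage=2)   # keeps all 8 frames
late = mask_condition(masks, step=900, total_steps=1000, stage=2)  # keeps only frame 0
print(early[:, 7].sum().item() > 0, late[:, 7].sum().item() == 0)  # True True
```

The conditioning tensor produced here would be fed to the video diffusion model alongside the input image and velocity signal; how KineMask actually injects these conditions is not specified in the abstract.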