[2603.21210] Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows

arXiv - Machine Learning March 24, 2026 4 min read

Computer Science > Machine Learning

arXiv:2603.21210 (cs) [Submitted on 22 Mar 2026]

Title: Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows

Authors: Janne Perini, Rafael Bischof, Moab Arar, Ayça Duran, Michael A. Kraus, Siddhartha Mishra, Bernd Bickel

Abstract: Designing urban spaces that provide pedestrian wind comfort and safety requires time-resolved Computational Fluid Dynamics (CFD) simulations, but their current computational cost makes extensive design exploration impractical. We introduce WinDiNet (Wind Diffusion Network), a pretrained video diffusion model that is repurposed as a fast, differentiable surrogate for this task. Starting from LTX-Video, a 2B-parameter latent video transformer, we fine-tune on 10,000 2D incompressible CFD simulations over procedurally generated building layouts. A systematic study of training regimes, conditioning mechanisms, and VAE adaptation strategies, including a physics-informed decoder loss, identifies a configuration that outperforms purpose-built neural PDE solvers. The resulting model generates full 112-frame rollouts in under a second. As the surrogate is end-to-end differentiable, it doubles as a physics simulator for gradient-based inverse optimization: given an urban footprint layout, we optimize building positions ...
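The idea of gradient-based inverse optimization through a differentiable surrogate can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `toy_wind_speed` is a hand-written smooth function standing in for a learned surrogate (it is not WinDiNet and does not reproduce the paper's model, data, or objective), the probe points, loss, learning rate, and domain bound are hypothetical, and the gradient is derived by hand here, whereas a real surrogate would typically supply it through automatic differentiation.

```python
import math

# Hypothetical pedestrian probe points (x, y) where wind comfort is evaluated.
PROBES = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.5)]


def toy_wind_speed(px, py, bx, by, inflow=5.0, radius=1.5):
    """Toy stand-in for a learned surrogate (NOT WinDiNet): the wind speed at
    probe (px, py) rises smoothly as the building at (bx, by) gets closer."""
    d2 = (px - bx) ** 2 + (py - by) ** 2
    return inflow * (1.0 + math.exp(-d2 / (2.0 * radius ** 2)))


def comfort_loss_and_grad(bx, by, probes=PROBES, inflow=5.0, radius=1.5):
    """Mean squared wind speed over the probes, plus its gradient w.r.t. (bx, by)."""
    loss = gx = gy = 0.0
    for px, py in probes:
        speed = toy_wind_speed(px, py, bx, by, inflow, radius)
        bump = speed - inflow  # the position-dependent part of the speed
        loss += speed ** 2
        # Chain rule: d(speed)/d(bx) = bump * (px - bx) / radius**2
        gx += 2.0 * speed * bump * (px - bx) / radius ** 2
        gy += 2.0 * speed * bump * (py - by) / radius ** 2
    n = len(probes)
    return loss / n, (gx / n, gy / n)


def optimize_building(bx, by, lr=0.01, steps=200, bound=5.0):
    """Plain gradient descent on the building position, clipped to the domain."""
    history = []
    for _ in range(steps):
        loss, (gx, gy) = comfort_loss_and_grad(bx, by)
        history.append(loss)
        bx = min(bound, max(-bound, bx - lr * gx))
        by = min(bound, max(-bound, by - lr * gy))
    return (bx, by), history
```

Calling `optimize_building(0.8, 1.0)` returns the final position and the loss recorded at each step. In this toy setup the building simply drifts away from the probes, so the sketch shows only the mechanics of descending a design objective through a differentiable simulator, not a realistic layout result.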

Originally published on March 24, 2026. Curated by AI News.
