[2603.21210] Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows
Computer Science > Machine Learning
arXiv:2603.21210 (cs)
[Submitted on 22 Mar 2026]
Title: Pretrained Video Models as Differentiable Physics Simulators for Urban Wind Flows
Authors: Janne Perini, Rafael Bischof, Moab Arar, Ayça Duran, Michael A. Kraus, Siddhartha Mishra, Bernd Bickel
Abstract: Designing urban spaces that provide pedestrian wind comfort and safety requires time-resolved Computational Fluid Dynamics (CFD) simulations, but their current computational cost makes extensive design exploration impractical. We introduce WinDiNet (Wind Diffusion Network), a pretrained video diffusion model that is repurposed as a fast, differentiable surrogate for this task. Starting from LTX-Video, a 2B-parameter latent video transformer, we fine-tune on 10,000 2D incompressible CFD simulations over procedurally generated building layouts. A systematic study of training regimes, conditioning mechanisms, and VAE adaptation strategies, including a physics-informed decoder loss, identifies a configuration that outperforms purpose-built neural PDE solvers. The resulting model generates full 112-frame rollouts in under a second. As the surrogate is end-to-end differentiable, it doubles as a physics simulator for gradient-based inverse optimization: given an urban footprint layout, we optimize building positions ...
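
The idea of gradient-based inverse optimization through a differentiable surrogate can be illustrated with a toy sketch. This is not WinDiNet and not the authors' method: `surrogate_speed` is a hypothetical closed-form stand-in for a learned model, the constants `U_IN`, `SHELTER`, `SIGMA`, and `PROBE` are invented for illustration, and the gradient is derived by hand where a real surrogate would rely on automatic differentiation. The sketch moves a single building until the predicted wind speed at a pedestrian probe point matches a target value.

```python
import math

# All constants below are hypothetical, chosen only for illustration.
U_IN = 5.0             # inflow wind speed in m/s
SHELTER = 0.8          # maximum fractional speed reduction near the building
SIGMA = 10.0           # length scale of the sheltered region in m
PROBE = (30.0, 20.0)   # pedestrian probe location in m


def surrogate_speed(building):
    """Toy stand-in for a learned surrogate: wind speed at PROBE
    as a smooth function of the building position (x, y)."""
    dx = building[0] - PROBE[0]
    dy = building[1] - PROBE[1]
    shelter = SHELTER * math.exp(-(dx * dx + dy * dy) / (2.0 * SIGMA ** 2))
    return U_IN * (1.0 - shelter)


def surrogate_grad(building):
    """Hand-derived gradient of surrogate_speed w.r.t. the building position.
    A neural surrogate would obtain this via backpropagation instead."""
    dx = building[0] - PROBE[0]
    dy = building[1] - PROBE[1]
    e = math.exp(-(dx * dx + dy * dy) / (2.0 * SIGMA ** 2))
    k = U_IN * SHELTER * e / SIGMA ** 2
    return (k * dx, k * dy)


def optimize(start, target, lr=5.0, steps=200):
    """Gradient descent on the loss (speed - target)^2 over the building position."""
    x, y = start
    for _ in range(steps):
        residual = surrogate_speed((x, y)) - target
        gx, gy = surrogate_grad((x, y))
        # Chain rule: d(loss)/d(position) = 2 * residual * d(speed)/d(position)
        x -= lr * 2.0 * residual * gx
        y -= lr * 2.0 * residual * gy
    return (x, y)


best = optimize(start=(15.0, 10.0), target=3.0)
print("optimized building position:", best)
print("predicted probe speed:", surrogate_speed(best))
```

Because every step of the forward model is differentiable, the loss gradient with respect to the layout is available at the cost of one forward and one backward pass, which is what makes a fast surrogate attractive for this kind of design loop.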