[2603.14294] Seeking Physics in Diffusion Noise
arXiv:2603.14294 (cs)
[Submitted on 15 Mar 2026 (v1), last revised 26 Mar 2026 (this version, v2)]

Title: Seeking Physics in Diffusion Noise
Authors: Chujun Tang, Lei Zhong, Fangqiang Ding

Abstract: Do video diffusion models encode signals predictive of physical plausibility? We probe intermediate denoising representations of a pretrained Diffusion Transformer (DiT) and find that physically plausible and implausible videos are partially separable in mid-layer feature space across noise levels. This separability cannot be fully attributed to visual quality or generator identity, suggesting recoverable physics-related cues in frozen DiT features. Leveraging this observation, we introduce progressive trajectory selection, an inference-time strategy that scores parallel denoising trajectories at a few intermediate checkpoints using a lightweight physics verifier trained on frozen features, and prunes low-scoring candidates early. Extensive experiments on PhyGenBench demonstrate that our method improves physical consistency while reducing inference cost, achieving comparable results to Best-of-K sampling with substantially fewer denoising steps.

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
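The pruning idea in the abstract (score parallel denoising trajectories at a few checkpoints, drop the low scorers early) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function name `progressive_trajectory_selection`, the `denoise_step` and `score` callables, the checkpoint positions, and the `keep_ratio` of 0.5. In particular, `score` is a toy stand-in for the physics verifier trained on frozen DiT features, and the latents are plain lists of floats rather than video latents.

```python
import random
from typing import Callable, List, Sequence, Tuple

Latent = List[float]


def progressive_trajectory_selection(
    init_latents: Sequence[Latent],
    denoise_step: Callable[[Latent, int], Latent],  # hypothetical one-step denoiser
    score: Callable[[Latent, int], float],          # hypothetical verifier score (higher is better)
    num_steps: int,
    checkpoints: Sequence[int],                     # step counts after which to prune (assumed)
    keep_ratio: float = 0.5,                        # fraction kept at each checkpoint (assumed)
) -> Tuple[Latent, int]:
    """Return (surviving latent, total number of denoising steps spent)."""
    alive = [list(z) for z in init_latents]
    checkpoint_set = set(checkpoints)
    steps_spent = 0
    for t in range(num_steps):
        # Advance every surviving trajectory by one denoising step.
        alive = [denoise_step(z, t) for z in alive]
        steps_spent += len(alive)
        # At a checkpoint, rank candidates and keep only the top fraction.
        if (t + 1) in checkpoint_set and len(alive) > 1:
            ranked = sorted(alive, key=lambda z: score(z, t), reverse=True)
            n_keep = max(1, int(len(ranked) * keep_ratio))
            alive = ranked[:n_keep]
    best = max(alive, key=lambda z: score(z, num_steps - 1))
    return best, steps_spent


# Toy stand-ins so the sketch runs end to end.
def toy_denoise(z: Latent, t: int) -> Latent:
    return [0.9 * x for x in z]


def toy_score(z: Latent, t: int) -> float:
    return -sum(abs(x) for x in z)


rng = random.Random(0)
K, T = 8, 20
latents = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(K)]
best, steps_spent = progressive_trajectory_selection(
    latents, toy_denoise, toy_score, num_steps=T, checkpoints=[5, 10, 15]
)
best_of_k_steps = K * T  # Best-of-K runs every candidate to completion
print(steps_spent, best_of_k_steps)
```

In this toy configuration the candidate pool halves at steps 5, 10, and 15 (8, then 4, then 2, then 1), so the run spends 40 + 20 + 10 + 5 = 75 denoising steps, compared with 160 for running all 8 candidates to completion. The savings in the paper depend on its actual checkpoint schedule and pruning rule, which the abstract does not specify.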