[2509.13789] BWCache: Accelerating Video Diffusion Transformers through Block-Wise Caching
Computer Science > Computer Vision and Pattern Recognition
arXiv:2509.13789 (cs)
[Submitted on 17 Sep 2025 (v1), last revised 28 Feb 2026 (this version, v3)]
Title: BWCache: Accelerating Video Diffusion Transformers through Block-Wise Caching
Authors: Hanshuai Cui, Zhiqing Tang, Zhifei Xu, Zhi Yao, Wenyi Zeng, Weijia Jia
Abstract: Recent advancements in Diffusion Transformers (DiTs) have established them as the state-of-the-art method for video generation. However, their inherently sequential denoising process results in inevitable latency, limiting real-world applicability. Existing acceleration methods either compromise visual quality due to architectural modifications or fail to reuse intermediate features at proper granularity. Our analysis reveals that DiT blocks are the primary contributors to inference latency. Across diffusion timesteps, the feature variations of DiT blocks exhibit a U-shaped pattern with high similarity during intermediate timesteps, which suggests substantial computational redundancy. In this paper, we propose Block-Wise Caching (BWCache), a training-free method to accelerate DiT-based video generation. BWCache dynamically caches and reuses features from DiT blocks across diffusion timesteps. Furthermore, we introduce a simil...
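To make the idea of caching block features across timesteps concrete, the following is a minimal sketch and not the authors' implementation. The abstract is cut off before it describes the paper's own reuse criterion, so everything here is an assumption chosen for illustration: the class `BlockCacheRunner`, the relative L1 change metric, the fixed `threshold`, and the rule that a reuse step is always followed by a recomputation. The blocks are plain callables standing in for DiT blocks.

```python
import numpy as np


def relative_l1_change(new, old):
    """Mean absolute difference, normalized by the mean magnitude of `old`."""
    return float(np.abs(new - old).mean() / (np.abs(old).mean() + 1e-8))


class BlockCacheRunner:
    """Toy runner that reuses cached block outputs when features barely change.

    Hypothetical illustration only; names and the reuse rule are assumptions.
    """

    def __init__(self, blocks, threshold=0.05):
        self.blocks = blocks        # list of callables: (x, t) -> x
        self.threshold = threshold  # assumed reuse threshold
        self.cache = None           # per-block outputs from the last full pass
        self.reuse_next = False     # whether the next step may skip the blocks
        self.computed_steps = 0     # number of timesteps that ran all blocks

    def step(self, x, t):
        if self.cache is not None and self.reuse_next:
            # Skip every block and return the cached final features.
            self.reuse_next = False  # force a recompute on the following step
            return self.cache[-1]

        # Full pass: run each block and record its output.
        outputs = []
        for block in self.blocks:
            x = block(x, t)
            outputs.append(x)

        if self.cache is not None:
            # Average per-block change relative to the previous full pass.
            change = np.mean(
                [relative_l1_change(n, o) for n, o in zip(outputs, self.cache)]
            )
            self.reuse_next = change < self.threshold

        self.cache = outputs
        self.computed_steps += 1
        return x
```

In this sketch, a block whose output does not depend on the timestep is recomputed on only some steps, while a block whose output changes strongly between steps is recomputed every time. A real DiT pipeline would also need to handle conditioning inputs and the scheduler update, which are omitted here.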