[2603.26571] Generation Is Compression: Zero-Shot Video Coding via Stochastic Rectified Flow
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.26571 (cs)
[Submitted on 27 Mar 2026]

Title: Generation Is Compression: Zero-Shot Video Coding via Stochastic Rectified Flow
Authors: Ziyue Zeng, Xun Su, Haoyuan Liu, Bingyu Lu, Yui Tatsumi, Hiroshi Watanabe

Abstract: Existing generative video compression methods use generative models only as post-hoc reconstruction modules atop conventional codecs. We propose the \emph{Generative Video Codec} (GVC), a zero-shot framework that turns a pretrained video generative model into the codec itself: the transmitted bitstream directly specifies the generative decoding trajectory, with no retraining required. To enable this, we convert the deterministic rectified-flow ODE of modern video foundation models into an equivalent SDE at inference time, unlocking per-step stochastic injection points for codebook-driven compression. Building on this unified backbone, we instantiate three complementary conditioning strategies -- \emph{Image-to-Video} (I2V) with adaptive tail-frame atom allocation, \emph{Text-to-Video} (T2V) operating at near-zero side information as a pure generative prior, and \emph{First-Last-Frame-to-Video} (FLF2V) with boundary-sharing GOP chaining for dual-anchor temporal control. Together, these variants span a principled trade-off space...
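As background for the ODE-to-SDE conversion the abstract mentions, the following is a standard sketch (not taken from the paper, whose exact construction is not shown here) of how a rectified-flow ODE with a Gaussian base and linear interpolation admits a family of SDEs sharing the same time-marginals; the noise schedule $\sigma_t$ is a free parameter, and the per-step Brownian increments $\mathrm{d}W_t$ are the natural stochastic injection points:

```latex
% Rectified-flow ODE learned by the foundation model:
%   dX_t = v_\theta(X_t, t)\, dt,
% with linear interpolation X_t = (1-t)X_0 + tX_1, X_0 ~ N(0, I) noise, X_1 data,
% so that v_\theta(x,t) \approx E[X_1 - X_0 \mid X_t = x].
%
% Any SDE of the form
%   dX_t = \Big[ v_\theta(X_t,t)
%          + \tfrac{\sigma_t^2}{2}\, \nabla_x \log p_t(X_t) \Big] dt
%          + \sigma_t\, dW_t
% has the same marginals p_t as the ODE (Fokker–Planck matching).
%
% For this Gaussian-base linear interpolation, the score is recoverable
% from the velocity itself, so no extra network is needed:
%   \nabla_x \log p_t(x) = \frac{t\, v_\theta(x,t) - x}{1 - t}.
```

The last identity follows from $\mathbb{E}[X_0 \mid X_t = x] = x - t\,v_\theta(x,t)$ together with the Gaussian-base score relation $\nabla_x \log p_t(x) = -\mathbb{E}[X_0 \mid X_t = x]/(1-t)$; the symbols $\sigma_t$ and $p_t$ are notational assumptions, not the paper's.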