[2601.16933] Reward-Forcing: Autoregressive Video Generation with Reward Feedback
Computer Science > Computer Vision and Pattern Recognition
arXiv:2601.16933 (cs)
[Submitted on 23 Jan 2026 (v1), last revised 2 Apr 2026 (this version, v2)]

Title: Reward-Forcing: Autoregressive Video Generation with Reward Feedback
Authors: Jingran Zhang, Ning Li, Yuanhao Ban, Andrew Bai, Justin Cui

Abstract: While most prior work in video generation relies on bidirectional architectures, recent efforts have sought to adapt these models into autoregressive variants that support near real-time generation. However, such adaptations often depend heavily on teacher models, which can limit performance, particularly when no strong autoregressive teacher is available, so output quality typically lags behind that of bidirectional counterparts. In this paper, we explore an alternative approach that uses reward signals to guide the generation process, enabling more efficient and scalable autoregressive generation. Guiding the model with reward signals simplifies training while preserving high visual fidelity and temporal consistency. Through extensive experiments on standard benchmarks, we find that our approach performs comparably to existing autoregressive models and, in some cases, surpasses similarly sized bidirectional models by avoiding the constraints imposed by teacher architectures. For ex...
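The abstract does not specify the training objective, so the following is only a minimal sketch of how reward feedback could steer an autoregressive video model: a per-sample reward weight applied to a next-frame reconstruction loss. All names (generator, reward_model, beta) are hypothetical and not taken from the paper.

import torch
import torch.nn.functional as F

def reward_weighted_step(generator, reward_model, frames, optimizer, beta=1.0):
    """One training step. frames: (batch, time, C, H, W) ground-truth clip."""
    context, target = frames[:, :-1], frames[:, 1:]

    # Autoregressive prediction: each frame is conditioned only on earlier frames.
    pred = generator(context)                        # (batch, T-1, C, H, W)
    recon = F.mse_loss(pred, target, reduction="none").mean(dim=(1, 2, 3, 4))

    # Reward model scores the generated continuation (e.g. fidelity, consistency);
    # scores are turned into normalized per-sample weights.
    with torch.no_grad():
        reward = reward_model(pred)                  # (batch,)
        weight = torch.softmax(beta * reward, dim=0) * reward.numel()

    # Higher-reward samples contribute more to the gradient, so the generator
    # is pushed toward outputs the reward model prefers, without a teacher model.
    loss = (weight * recon).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()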