[2510.02282] VidGuard-R1: AI-Generated Video Detection and Explanation via Reasoning MLLMs and RL
Computer Science > Computer Vision and Pattern Recognition
arXiv:2510.02282 (cs)
[Submitted on 2 Oct 2025 (v1), last revised 5 Mar 2026 (this version, v3)]

Title: VidGuard-R1: AI-Generated Video Detection and Explanation via Reasoning MLLMs and RL

Authors: Kyoungjun Park, Yifan Yang, Juheon Yi, Shicheng Zheng, Yifei Shen, Dongqi Han, Caihua Shan, Muhammad Muaz, Lili Qiu

Abstract: The rapid proliferation of AI-generated video necessitates robust detection tools that offer both high accuracy and human-interpretable explanations. While existing MLLM-based detectors rely on supervised fine-tuning (SFT) or direct preference optimization (DPO), these methods are often bottlenecked by static, pre-labeled datasets that fail to capture the evolving, multi-step physical inconsistencies of modern generative models. To bridge this gap, we introduce VidGuard-R1, the first video authenticity detector to utilize group relative policy optimization (GRPO). Moving beyond passive preference matching, VidGuard-R1 employs a reinforcement learning framework that encourages the model to explore and rank multiple reasoning paths. By introducing specialized reward models for temporal stability and diffusion-aware complexity, we incentivize the model to discover 'physics-grounded' artifacts. Our contributions include: (1...
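
The abstract's phrase "explore and rank multiple reasoning paths" refers to the group-relative step in GRPO, which can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `group_relative_advantages`, the example reward values, and the use of the population standard deviation are assumptions made for illustration, and the paper's temporal-stability and diffusion-aware reward models are not reproduced here.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each sampled response relative to its own sampling group.

    GRPO-style normalization: subtract the group mean and divide by the
    group standard deviation, so no separate value network is needed.
    Using the population standard deviation is an assumption of this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every response gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scalar rewards for four reasoning paths sampled for one video
# (for example, 1.0 when the real/fake verdict is correct, 0.0 otherwise).
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
```

Responses scoring above the group mean receive a positive advantage and are reinforced, while those below the mean are discouraged. When all responses in a group receive the same reward, every advantage is zero and that group contributes no learning signal.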