[2604.03179] Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models
Computer Science > Machine Learning
arXiv:2604.03179 (cs)
[Submitted on 3 Apr 2026]

Title: Understanding the Role of Hallucination in Reinforcement Post-Training of Multimodal Reasoning Models

Authors: Gengwei Zhang, Jie Peng, Zhen Tan, Mufan Qiu, Hossein Nourkhiz Mahjoub, Vaishnav Tadiparthi, Kwonjoon Lee, Yanyong Zhang, Tianlong Chen

Abstract: The recent success of reinforcement learning (RL) in large reasoning models has inspired the growing adoption of RL for post-training Multimodal Large Language Models (MLLMs) to enhance their visual reasoning capabilities. Although many studies have reported improved performance, it remains unclear whether RL training truly enables models to learn from visual information. In this work, we propose the Hallucination-as-Cue Framework, an analytical framework designed to investigate the effects of RL-based post-training on multimodal reasoning models from the perspective of model hallucination. Specifically, we introduce hallucination-inducing, modality-specific corruptions that remove or replace the essential information required to derive correct answers, thereby forcing the model to reason through hallucination. By applying these corruptions during both training and evaluation, our framework provides a unique perspective for diagnosing RL training...
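The corruption scheme described in the abstract (removing or replacing the visual evidence a question depends on) can be illustrated with a minimal sketch. This sketch assumes image-question pairs; the two corruption modes and every name in it (corrupt_image, distractor_pool) are hypothetical illustrations, since the truncated abstract does not specify the paper's actual corruption operators.

```python
"""
Minimal sketch of hallucination-inducing, modality-specific corruptions
for image-question pairs. The strategies ("remove" blanks the image,
"replace" swaps in an unrelated one) and all names here are illustrative
assumptions, not the paper's exact implementation.
"""
import random
from PIL import Image


def corrupt_image(image: Image.Image,
                  distractor_pool: list[Image.Image],
                  mode: str = "remove",
                  seed: int | None = None) -> Image.Image:
    """Return a corrupted image that no longer contains the visual
    evidence required to answer the paired question."""
    rng = random.Random(seed)
    if mode == "remove":
        # Blank out the image entirely: any correct-sounding answer the
        # model produces must be hallucinated rather than grounded.
        return Image.new("RGB", image.size, color=(127, 127, 127))
    if mode == "replace":
        # Substitute an unrelated image so the visual cues actively
        # conflict with the question, rather than merely being absent.
        distractor = rng.choice(distractor_pool)
        return distractor.resize(image.size)
    raise ValueError(f"unknown corruption mode: {mode}")
```

Comparing a model's accuracy and reward on clean versus corrupted inputs, both during RL training and at evaluation, then serves as a direct probe of whether its answers are grounded in the image or hallucinated.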