[2601.22451] Countering the Over-Reliance Trap: Mitigating Object Hallucination for LVLMs via a Self-Validation Framework
Computer Science > Computer Vision and Pattern Recognition

arXiv:2601.22451 (cs)

[Submitted on 30 Jan 2026 (v1), last revised 8 Apr 2026 (this version, v2)]

Title: Countering the Over-Reliance Trap: Mitigating Object Hallucination for LVLMs via a Self-Validation Framework

Authors: Shiyu Liu, Xinyi Wen, Zhibin Lan, Ante Wang, Jinsong Su

Abstract: Despite progress in Large Vision-Language Models (LVLMs), object hallucination remains a critical issue in image captioning, where models describe objects that are not present in the image, compromising their reliability. Previous work attributes this to LVLMs' over-reliance on language priors and attempts to mitigate it through logits calibration, but offers little analysis of the over-reliance itself. To understand it more deeply, we conduct a series of preliminary experiments, which indicate that as generation length increases, over-reliance on language priors inflates the probability of hallucinated object tokens, exacerbating object hallucination. To circumvent this issue, we propose Language-Prior-Free Verification, which enables LVLMs to faithfully verify the confidence of object existence. Building on this, we propose a novel training-free Self-Validation Framework to counter the over-reliance…
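
The abstract does not spell out the mechanics of Language-Prior-Free Verification. A minimal sketch of one plausible reading, assuming a generic LLaVA-style LVLM whose forward pass accepts an optional `images` argument (the function names, the `images=None` convention for withholding the image, and the margin `tau` are all hypothetical, not the paper's method): score each mentioned object with and without the image, so the language prior is subtracted out and only visual evidence supports keeping the object.

```python
import torch

@torch.no_grad()
def visual_evidence_score(model, tokenizer, caption_prefix, object_word, image):
    """Log-prob gain for `object_word` when the image is shown vs. withheld.
    A positive gap means the image, not the language prior, supports the object.
    Hypothetical interface: `model(input_ids=..., images=...)` as in LLaVA-style LVLMs."""
    ids = tokenizer(caption_prefix, return_tensors="pt").input_ids
    # Leading space so the word is tokenized as it would appear mid-caption.
    obj_id = tokenizer(" " + object_word, add_special_tokens=False).input_ids[0]

    logits_img = model(input_ids=ids, images=image).logits[0, -1]  # image + prior
    logits_txt = model(input_ids=ids, images=None).logits[0, -1]   # prior only

    logp_img = torch.log_softmax(logits_img, dim=-1)[obj_id]
    logp_txt = torch.log_softmax(logits_txt, dim=-1)[obj_id]
    return (logp_img - logp_txt).item()

def keep_object(model, tokenizer, caption_prefix, object_word, image, tau=0.0):
    """Training-free check: retain the object mention only if visual evidence
    outweighs the language prior by margin `tau` (a hypothetical threshold)."""
    return visual_evidence_score(model, tokenizer, caption_prefix,
                                 object_word, image) > tau
```

Under this reading, the check is training-free because it reuses the frozen LVLM's own logits; the preliminary finding above suggests the prior-only term grows with generation length, which is exactly the component the subtraction removes.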