[2601.22451] Countering the Over-Reliance Trap: Mitigating Object Hallucination for LVLMs via a Self-Validation Framework

arXiv - AI 4 min read

Computer Science > Computer Vision and Pattern Recognition

arXiv:2601.22451 (cs)

[Submitted on 30 Jan 2026 (v1), last revised 8 Apr 2026 (this version, v2)]

Title: Countering the Over-Reliance Trap: Mitigating Object Hallucination for LVLMs via a Self-Validation Framework

Authors: Shiyu Liu, Xinyi Wen, Zhibin Lan, Ante Wang, Jinsong Su

Abstract: Despite progress in Large Vision Language Models (LVLMs), object hallucination remains a critical issue in the image captioning task, where models generate descriptions of non-existent objects, compromising their reliability. Previous work attributes this to LVLMs' over-reliance on language priors and attempts to mitigate it through logits calibration. However, it still lacks a thorough analysis of this over-reliance. To gain a deeper understanding, we conduct a series of preliminary experiments, which indicate that as the generation length increases, LVLMs' over-reliance on language priors inflates the probability of hallucinated object tokens, consequently exacerbating object hallucination. To circumvent this issue, we propose Language-Prior-Free Verification to enable LVLMs to faithfully verify the confidence of object existence. Based on this, we propose a novel training-free Self-Validation Framework to counter the o...
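The abstract excerpt does not say how the verification confidence is computed or how the framework uses it, so the following is only a toy sketch of the general idea: score each object mentioned in a caption with a separate existence check and drop the ones that score low. The function name filter_objects, the verify callable, the 0.5 threshold, and the example scores are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple


def filter_objects(
    candidates: Iterable[str],
    verify: Callable[[str], float],
    threshold: float = 0.5,
) -> Tuple[List[str], List[str]]:
    """Split candidate objects into (kept, rejected) by a verification score.

    `verify` is a hypothetical stand-in for a check that asks the model
    about one object at a time, without the previously generated caption
    as context. How the paper actually computes this is not stated in
    the abstract excerpt.
    """
    kept: List[str] = []
    rejected: List[str] = []
    for obj in candidates:
        score = verify(obj)
        if score >= threshold:
            kept.append(obj)
        else:
            rejected.append(obj)
    return kept, rejected


# Made-up scores standing in for a real model's existence confidence.
toy_scores = {"dog": 0.93, "frisbee": 0.81, "bench": 0.12}
kept, rejected = filter_objects(toy_scores, lambda obj: toy_scores[obj])
print("kept:", kept)
print("rejected:", rejected)
```

In this toy run, "bench" plays the role of an object that a long caption might mention because it often co-occurs with the others in text, yet scores low when checked on its own.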

Originally published on April 09, 2026. Curated by AI News.