[2510.10285] Reallocating Attention Across Layers to Reduce Multimodal Hallucination
Computer Science > Artificial Intelligence
arXiv:2510.10285 (cs)
[Submitted on 11 Oct 2025 (v1), last revised 27 Feb 2026 (this version, v2)]

Title: Reallocating Attention Across Layers to Reduce Multimodal Hallucination
Authors: Haolang Lu, Bolun Chu, WeiYe Fu, Guoshun Nan, Junning Liu, Minghui Pan, Qiankun Li, Yi Yu, Hua Wang, Kun Wang

Abstract: Multimodal large reasoning models (MLRMs) often suffer from hallucinations that stem not only from insufficient visual grounding but also from an imbalanced allocation between perception and reasoning processes. Building on recent interpretability findings that suggest a staged division of attention across layers, we analyze how this functional misalignment leads to two complementary failure modes: perceptual bias in shallow layers and reasoning drift in deeper layers. To alleviate these issues, we propose Functional Head Identification and Class-Conditioned Rescaling, a lightweight, training-free plugin that identifies perception- and reasoning-oriented heads and adaptively rebalances their layerwise contributions. Our method improves reasoning consistency and visual faithfulness without retraining or any architectural modification. Evaluations across three representative MLRMs and five multimodal reasoning benchmarks show an average gain of 4.2 percentage points, with less than 1% a...
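The abstract describes the mechanism only at a high level: tag each attention head as perception- or reasoning-oriented, then apply class-conditioned, layer-dependent rescaling to the heads' contributions. Below is a minimal sketch of that idea, assuming heads are classified by the share of attention mass they place on image tokens and rescaled with a simple depth-dependent schedule; the threshold, gains, and tensor layout are illustrative assumptions, not the authors' released implementation.

```python
import torch

def classify_heads(attn, image_token_mask, threshold=0.3):
    """attn: [num_heads, query_len, key_len] attention weights for one layer.
    image_token_mask: boolean [key_len] marking image-token positions.
    A head is tagged 'perception' if it places a large share of its attention
    mass on image tokens, otherwise 'reasoning'. The threshold is a guess."""
    image_mass = attn[:, :, image_token_mask].sum(dim=-1).mean(dim=-1)  # [num_heads]
    return image_mass > threshold  # True = perception head

def rescale_head_outputs(head_outputs, is_perception, layer_idx, num_layers,
                         perception_gain=1.2, reasoning_gain=0.9):
    """head_outputs: [num_heads, query_len, head_dim] per-head outputs before
    the output projection. Shallow layers up-weight perception heads and deep
    layers down-weight reasoning heads, loosely mirroring the two failure modes
    named in the abstract; the exact schedule here is an assumption."""
    depth = layer_idx / max(num_layers - 1, 1)  # 0 at the first layer, 1 at the last
    scale = torch.where(
        is_perception,
        torch.tensor(1.0 + (perception_gain - 1.0) * (1.0 - depth)),
        torch.tensor(1.0 - (1.0 - reasoning_gain) * depth),
    )  # [num_heads]
    return head_outputs * scale[:, None, None]
```

Because the rescaling acts only on per-head outputs at inference time, a plugin of this shape needs no retraining and no architectural change, which matches the "training-free" framing in the abstract.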