[2603.21687] Mirage The Illusion of Visual Understanding
Computer Science > Artificial Intelligence
arXiv:2603.21687 (cs)
[Submitted on 23 Mar 2026]
Title: Mirage The Illusion of Visual Understanding
Authors: Mohammad Asadi, Jack W. O'Sullivan, Fang Cao, Tahoura Nedaee, Kamyar Fardi, Fei-Fei Li, Ehsan Adeli, Euan Ashley
Abstract: Multimodal AI systems have achieved remarkable performance across a broad range of real-world tasks, yet the mechanisms underlying visual-language reasoning remain surprisingly poorly understood. We report three findings that challenge prevailing assumptions about how these systems process and integrate visual information. First, frontier models readily generate detailed image descriptions and elaborate reasoning traces, including pathology-biased clinical findings, for images never provided; we term this phenomenon mirage reasoning. Second, without any image input, models also attain strikingly high scores across general and medical multimodal benchmarks, calling into question the utility and design of those benchmarks. In the most extreme case, our model achieved the top rank on a standard chest X-ray question-answering benchmark without access to any images. Third, when models were explicitly instructed to guess answers without image access, rather than being implicitly prompted to assume images were present, performance declined markedly. Explicit guessing appears to engage a mo...
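The comparison described in the third finding (image-free accuracy under an implicit prompt versus an explicit "guess" prompt) could be set up roughly as below. This is a minimal sketch and not the authors' evaluation code: the prompt templates, the item fields (`question`, `options`, `answer`), and the `model` callable are all hypothetical, since the abstract does not specify them.

```python
from typing import Callable, Dict, List

# Hypothetical prompt wording; the abstract does not give the actual prompts.
IMPLICIT_TEMPLATE = (
    "Look at the image and answer with a single letter.\n"
    "Question: {question}\nOptions: {options}\nAnswer:"
)
EXPLICIT_TEMPLATE = (
    "No image is available. Guess the most likely answer with a single letter.\n"
    "Question: {question}\nOptions: {options}\nAnswer:"
)


def build_prompt(item: Dict, mode: str) -> str:
    """Build a text-only prompt; no image is attached in either mode."""
    if mode == "implicit":
        template = IMPLICIT_TEMPLATE
    elif mode == "explicit":
        template = EXPLICIT_TEMPLATE
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return template.format(
        question=item["question"], options=", ".join(item["options"])
    )


def image_free_accuracy(
    items: List[Dict], model: Callable[[str], str], mode: str
) -> float:
    """Fraction of items answered correctly when the model sees text only."""
    if not items:
        raise ValueError("items must not be empty")
    correct = 0
    for item in items:
        # `model` is an assumed text-in, text-out callable returning a letter.
        prediction = model(build_prompt(item, mode)).strip().upper()
        if prediction == item["answer"].upper():
            correct += 1
    return correct / len(items)
```

Running `image_free_accuracy` once with `mode="implicit"` and once with `mode="explicit"` on the same items gives the two numbers whose gap the abstract describes; any real study would also need a chance-level baseline and a fixed answer-parsing rule.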