[2603.23501] MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.23501 (cs)
[Submitted on 24 Mar 2026]

Title: MedObvious: Exposing the Medical Moravec's Paradox in VLMs via Clinical Triage

Authors: Ufaq Khan, Umair Nawaz, L D M S S Teja, Numaan Saeed, Muhammad Bilal, Yutong Xie, Mohammad Yaqub, Muhammad Haris Khan

Abstract: Vision Language Models (VLMs) are increasingly used for tasks like medical report generation and visual question answering. However, fluent diagnostic text does not guarantee safe visual understanding. In clinical practice, interpretation begins with pre-diagnostic sanity checks: verifying that the input is valid to read (correct modality and anatomy, plausible viewpoint and orientation, and no obvious integrity violations). Existing benchmarks largely assume this step is solved, and therefore miss a critical failure mode: a model can produce plausible narratives even when the input is inconsistent or invalid. We introduce MedObvious, a 1,880-task benchmark that isolates input validation as a set-level consistency capability over small multi-panel image sets: the model must identify whether any panel violates expected coherence. MedObvious spans five progressive tiers, from basic orientation/modality mismatches to clinically motivated anatomy/viewpoint verification and triage-style...
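
The set-level consistency task described in the abstract (decide whether any panel in a small image set violates expected coherence) could be scored roughly as in the sketch below. This is only an illustration under stated assumptions, not the benchmark's actual format or metric: the `PanelSetTask` record, its field names, the "at most one violating panel" convention, and the exact-match scoring rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PanelSetTask:
    """Hypothetical task record; not the real MedObvious schema."""
    task_id: str
    num_panels: int
    # Index of the panel that breaks coherence, or None if the set is coherent.
    # Assumption: at most one violating panel per set.
    odd_panel: Optional[int]


def score_prediction(task: PanelSetTask, predicted: Optional[int]) -> bool:
    """Return True when the predicted answer matches the ground truth exactly."""
    if predicted is not None and not 0 <= predicted < task.num_panels:
        return False  # an out-of-range panel index counts as wrong
    return predicted == task.odd_panel


def accuracy(tasks: Sequence[PanelSetTask],
             predictions: Sequence[Optional[int]]) -> float:
    """Fraction of tasks answered correctly (assumed exact-match metric)."""
    if len(tasks) != len(predictions):
        raise ValueError("tasks and predictions must have the same length")
    if not tasks:
        return 0.0
    correct = sum(score_prediction(t, p) for t, p in zip(tasks, predictions))
    return correct / len(tasks)
```

Under this convention, a model that flags a panel in a fully coherent set is scored wrong in the same way as one that misses a real violation, which mirrors the abstract's point that a plausible answer on invalid input is itself a failure.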