[2603.03989] When Visual Evidence is Ambiguous: Pareidolia as a Diagnostic Probe for Vision Models
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.03989 (cs)
[Submitted on 4 Mar 2026]

Title: When Visual Evidence is Ambiguous: Pareidolia as a Diagnostic Probe for Vision Models
Authors: Qianpu Chen, Derya Soydaner, Rob Saunders

Abstract: When visual evidence is ambiguous, vision models must decide whether to interpret face-like patterns as meaningful. Face pareidolia, the perception of faces in non-face objects, provides a controlled probe of this behavior. We introduce a representation-level diagnostic framework that analyzes detection, localization, uncertainty, and bias across class, difficulty, and emotion in face pareidolia images. Under a unified protocol, we evaluate six models spanning four representational regimes: vision-language models (VLMs; CLIP-B/32, CLIP-L/14, LLaVA-1.5-7B), pure vision classification (ViT), general object detection (YOLOv8), and face detection (RetinaFace). Our analysis reveals three mechanisms of interpretation under ambiguity. VLMs exhibit semantic overactivation, systematically pulling ambiguous non-human regions toward the Human concept, with LLaVA-1.5-7B producing the strongest and most confident over-calls, especially for negative emotions. ViT instead follows an uncertainty-as-abstention strategy, remaining diffuse yet ...
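The contrast the abstract draws between a confident over-call and a diffuse, abstaining prediction can be illustrated with a small sketch. This is not the paper's framework: the excerpt does not define its uncertainty measure, so the code below assumes normalized Shannon entropy over a softmax distribution, and the function names, the label set, and the `abstain_above` threshold are all hypothetical.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def normalized_entropy(probs):
    """Shannon entropy divided by log(n): 0 is fully peaked, 1 is uniform."""
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def diagnose(logits, labels, abstain_above=0.9):
    """Summarize one prediction on an ambiguous image.

    `abstain_above` is an illustrative threshold, not a value from the paper.
    """
    probs = softmax(logits)
    top = max(range(len(probs)), key=probs.__getitem__)
    uncertainty = normalized_entropy(probs)
    return {
        "label": labels[top],
        "confidence": probs[top],
        "uncertainty": uncertainty,
        "abstain": uncertainty > abstain_above,
    }


# Hypothetical label set and scores, chosen only to show the two regimes.
LABELS = ["human", "animal", "object"]
overcall = diagnose([10.0, 0.0, 0.0], LABELS)  # peaked toward "human"
diffuse = diagnose([0.0, 0.0, 0.0], LABELS)    # uniform scores
```

Under these assumptions, a peaked score vector yields a high-confidence "human" call with low uncertainty, while a uniform score vector yields uncertainty near 1 and is flagged as an abstention.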