[2602.20068] The Invisible Gorilla Effect in Out-of-distribution Detection
Summary
The paper explores the 'Invisible Gorilla Effect' in out-of-distribution (OOD) detection, revealing that detection performance varies with the visual similarity (e.g. colour) between OOD artefacts and the model's regions of interest (ROI) in images.
Why It Matters
Understanding the biases in OOD detection is crucial for developing more reliable AI systems. This research highlights a significant failure mode that can lead to misclassifications, impacting applications in critical areas such as medical imaging and autonomous systems.
Key Takeaways
- The 'Invisible Gorilla Effect' describes how detection performance improves when an artefact is visually similar (e.g. in colour) to the model's region of interest (ROI), and drops when it is not.
- Detection methods show significant performance drops when artefacts differ in colour from the ROI.
- The study evaluated 40 OOD methods across 7 benchmarks, revealing an overlooked failure mode.
- The authors annotated artefacts by colour in 11,355 images from three public datasets to substantiate the findings.
- Findings provide guidance for developing more robust OOD detection systems.
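The benchmark comparisons above report AUROC, which for OOD detection is the probability that a randomly chosen OOD sample receives a higher detector score than a randomly chosen in-distribution sample. A minimal sketch of that metric (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def auroc(id_scores, ood_scores):
    """AUROC of an OOD detector: probability that a random OOD sample
    scores higher than a random in-distribution sample (ties count 0.5)."""
    id_scores = np.asarray(id_scores, dtype=float)
    ood_scores = np.asarray(ood_scores, dtype=float)
    greater = (ood_scores[:, None] > id_scores[None, :]).sum()
    ties = (ood_scores[:, None] == id_scores[None, :]).sum()
    return (greater + 0.5 * ties) / (len(id_scores) * len(ood_scores))

# Toy example: an imperfect detector whose OOD scores are shifted upward.
rng = np.random.default_rng(0)
id_s = rng.normal(0.0, 1.0, 500)
ood_s = rng.normal(1.5, 1.0, 500)
print(round(auroc(id_s, ood_s), 3))
```

A 31.5% AUROC gap between red-ink and black-ink artefacts, as the paper reports for the Mahalanobis Score, is therefore a large difference in this ranking probability, not a small calibration effect.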
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.20068 (cs) [Submitted on 23 Feb 2026]
Title: The Invisible Gorilla Effect in Out-of-distribution Detection
Authors: Harry Anthony, Ziyun Liang, Hermione Warr, Konstantinos Kamnitsas
Abstract: Deep Neural Networks achieve high performance in vision tasks by learning features from regions of interest (ROI) within images, but their performance degrades when deployed on out-of-distribution (OOD) data that differs from training data. This challenge has led to OOD detection methods that aim to identify and reject unreliable predictions. Although prior work shows that OOD detection performance varies by artefact type, the underlying causes remain underexplored. To this end, we identify a previously unreported bias in OOD detection: for hard-to-detect artefacts (near-OOD), detection performance typically improves when the artefact shares visual similarity (e.g. colour) with the model's ROI and drops when it does not - a phenomenon we term the Invisible Gorilla Effect. For example, in a skin lesion classifier with red lesion ROI, we show the method Mahalanobis Score achieves a 31.5% higher AUROC when detecting OOD red ink (similar to ROI) compared to black ink (dissimilar) annotations. We annotated artefacts by colour in 11,355 images from three public datase...
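The Mahalanobis Score mentioned in the abstract scores a sample by its Mahalanobis distance, in feature space, from a Gaussian fitted to in-distribution training features; larger distance suggests OOD. A simplified single-Gaussian sketch (the full method fits per-class Gaussians over network features; names here are illustrative):

```python
import numpy as np

def fit_gaussian(features):
    """Fit a Gaussian to in-distribution feature vectors
    (single-distribution simplification of the per-class version)."""
    mu = features.mean(axis=0)
    cov = np.cov(features, rowvar=False)
    prec = np.linalg.pinv(cov)  # pseudo-inverse for numerical safety
    return mu, prec

def mahalanobis_score(x, mu, prec):
    """Squared Mahalanobis distance of x from the fitted Gaussian.
    Larger value => more likely OOD."""
    d = x - mu
    return float(d @ prec @ d)

rng = np.random.default_rng(1)
train_feats = rng.normal(0.0, 1.0, size=(1000, 8))  # stand-in for network features
mu, prec = fit_gaussian(train_feats)
in_sample = rng.normal(0.0, 1.0, size=8)
ood_sample = rng.normal(5.0, 1.0, size=8)  # far from the training distribution
print(mahalanobis_score(in_sample, mu, prec) < mahalanobis_score(ood_sample, mu, prec))
```

Because the score lives entirely in feature space, it inherits whatever the backbone attends to: artefacts resembling the ROI perturb those features strongly, which is one plausible mechanism behind the colour-dependent gap the paper measures.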