[2602.15892] Egocentric Bias in Vision-Language Models
Summary
The paper introduces FlipSet, a benchmark for assessing visual perspective taking in vision-language models, revealing a systematic egocentric bias in their predictions.
Why It Matters
Understanding egocentric bias in vision-language models is crucial for improving AI's social cognition and spatial reasoning capabilities. The findings highlight limitations in current models, which could inform future research and development in AI systems that require perspective-taking abilities.
Key Takeaways
- FlipSet benchmark assesses Level-2 visual perspective taking in VLMs.
- Most of the 103 evaluated VLMs perform below chance, with roughly three-quarters of errors reproducing the camera viewpoint.
- Models handle theory-of-mind and mental rotation well in isolation, yet fail when the task requires combining the two.
- Current VLMs lack mechanisms to bind social awareness with spatial operations.
- FlipSet offers a cognitively grounded testbed for diagnosing perspective-taking in multimodal models.
arXiv:2602.15892 (cs) — Computer Science > Computer Vision and Pattern Recognition [Submitted on 10 Feb 2026]
Title: Egocentric Bias in Vision-Language Models
Authors: Maijunxian Wang, Yijiang Li, Bingyang Wang, Tianwei Zhao, Ran Ji, Qingying Gao, Emmy Liu, Hokin Deng, Dezhi Luo
Abstract: Visual perspective taking — inferring how the world appears from another's viewpoint — is foundational to social cognition. We introduce FlipSet, a diagnostic benchmark for Level-2 visual perspective taking (L2 VPT) in vision-language models. The task requires simulating 180-degree rotations of 2D character strings from another agent's perspective, isolating spatial transformation from 3D scene complexity. Evaluating 103 VLMs reveals systematic egocentric bias: the vast majority perform below chance, with roughly three-quarters of errors reproducing the camera viewpoint. Control experiments expose a compositional deficit — models achieve high theory-of-mind accuracy and above-chance mental rotation in isolation, yet fail catastrophically when integration is required. This dissociation indicates that current VLMs lack the mechanisms needed to bind social awareness to spatial operations, suggesting fundamental limitations in model-based spatial reasoning. FlipSet provides a cognitively grounded testbed for diagnosing perspective-taking capabilities in multimo...
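To make the task concrete, here is a minimal sketch of the kind of transformation the abstract describes: an agent facing the camera sees a flat character string rotated 180 degrees, which amounts to reversing the character order and flipping each glyph in place. This is not the paper's implementation, and the glyph mapping below is a hypothetical example restricted to rotation-symmetric characters.

```python
# Hypothetical mapping for glyphs that remain valid characters
# after a 180-degree in-plane rotation (illustrative, not from the paper).
FLIP = {
    "b": "q", "q": "b", "d": "p", "p": "d",
    "n": "u", "u": "n", "6": "9", "9": "6",
    "0": "0", "1": "1", "8": "8", "o": "o",
    "s": "s", "x": "x", "z": "z",
    "H": "H", "I": "I", "N": "N", "O": "O",
    "S": "S", "X": "X", "Z": "Z",
}

def rotate_180(s: str) -> str:
    """Rotate a 2D character string 180 degrees:
    reverse the character order, then flip each glyph."""
    return "".join(FLIP[c] for c in reversed(s))

# An egocentric (camera-view) answer simply repeats the input;
# the correct allocentric answer applies the full rotation.
print(rotate_180("bud"))  # -> "pnq"
```

Note that the rotation is an involution: applying it twice returns the original string, which is one way such a benchmark could sanity-check its ground-truth labels.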