[2511.21471] SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition
Computer Science > Artificial Intelligence
arXiv:2511.21471 (cs)
[Submitted on 26 Nov 2025 (v1), last revised 4 Mar 2026 (this version, v2)]

Title: SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition
Authors: Peiran Xu, Sudong Wang, Yao Zhu, Jianing Li, Gege Qi, Yunjian Zhang

Abstract: Spatial cognition is fundamental to real-world multimodal intelligence, enabling models to interact effectively with the physical environment. While multimodal large language models (MLLMs) have made significant strides, existing benchmarks often oversimplify spatial cognition, reducing it to a single-dimensional metric that fails to capture the hierarchical structure and interdependence of spatial abilities. To address this gap, we propose a hierarchical spatial cognition framework that decomposes spatial intelligence into five levels of increasing complexity, from basic observation to high-level planning. Building on this taxonomy, we construct SpatialBench, a large-scale, fine-grained benchmark covering 15 tasks aligned with these cognitive levels. To provide unified evaluation across heterogeneous tasks, we further introduce a high-level, capability-oriented metric that reliably assesses a model's overall spatial reasoning ability. Extensive experiments across a wide range of MLLMs reveal distinct...