[2511.21471] SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition

Computer Science > Artificial Intelligence
arXiv:2511.21471 (cs)
Submitted on 26 Nov 2025 (v1), last revised 4 Mar 2026 (this version, v2)

Title: SpatialBench: Benchmarking Multimodal Large Language Models for Spatial Cognition

Authors: Peiran Xu, Sudong Wang, Yao Zhu, Jianing Li, Gege Qi, Yunjian Zhang

Abstract: Spatial cognition is fundamental to real-world multimodal intelligence, allowing models to effectively interact with the physical environment. While multimodal large language models (MLLMs) have made significant strides, existing benchmarks often oversimplify spatial cognition, reducing it to a single-dimensional metric, which fails to capture the hierarchical structure and interdependence of spatial abilities. To address this gap, we propose a hierarchical spatial cognition framework that decomposes spatial intelligence into five progressively complex levels from basic observation to high-level planning. Building upon this taxonomy, we construct SpatialBench, a large-scale, fine-grained benchmark covering 15 tasks aligned with these cognitive levels. To provide a unified evaluation across heterogeneous tasks, we further introduce a high-level capability-oriented metric that reliably assesses a model's overall spatial reasoning ability. Extensive experiments over massive MLLMs reveal distinct...

Originally published on March 05, 2026. Curated by AI News.
