[2603.03002] SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models
Computer Science > Artificial Intelligence
arXiv:2603.03002 (cs)
[Submitted on 3 Mar 2026]

Title: SpatialText: A Pure-Text Cognitive Benchmark for Spatial Understanding in Large Language Models
Authors: Peiyao Jiang, Zequn Qin, Xi Li

Abstract: Genuine spatial reasoning relies on the capacity to construct and manipulate coherent internal spatial representations, often conceptualized as mental models, rather than merely processing surface linguistic associations. While large language models exhibit advanced capabilities across various domains, existing benchmarks fail to isolate this intrinsic spatial cognition from statistical language heuristics. Furthermore, multimodal evaluations frequently conflate genuine spatial reasoning with visual perception. To systematically investigate whether models construct flexible spatial mental models, we introduce SpatialText, a theory-driven diagnostic framework. Rather than functioning simply as a dataset, SpatialText isolates text-based spatial reasoning through a dual-source methodology. It integrates human-annotated descriptions of real 3D indoor environments, which capture natural ambiguities, perspective shifts, and functional relations, with code-generated, logically precise scenes designed to probe formal spatial deduction and epistemic bounda...