[2603.28258] Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries
Computer Science > Computation and Language
arXiv:2603.28258 (cs) [Submitted on 30 Mar 2026]

Title: Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries
Authors: Jon-Paul Cacioli

Abstract: Categorical perception (CP) -- enhanced discriminability at category boundaries -- is among the most studied phenomena in perceptual psychology. This paper reports that analogous geometric warping occurs in the hidden-state representations of large language models (LLMs) processing Arabic numerals. Using representational similarity analysis across six models from five architecture families, the study finds that a CP-additive model (log-distance plus a boundary boost) fits the representational geometry better than a purely continuous model at 100% of primary layers in every model tested. The effect is specific to structurally defined boundaries (digit-count transitions at 10 and 100), absent at non-boundary control positions, and absent in the temperature domain, where linguistic categories (hot/cold) lack a tokenisation discontinuity. Two qualitatively distinct signatures emerge: "classic CP" (Gemma, Qwen), where models both categorise explicitly and show geometric warping, and "structural CP" (Llama, Mistral, Phi), where geometry warps at the boundary b...
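The abstract's model comparison can be sketched in code. The snippet below is a minimal illustration, not the paper's implementation: it assumes the "CP-additive model" is an ordinary least-squares fit of pairwise hidden-state dissimilarities to a log-distance predictor plus a digit-count-boundary indicator, compared by R^2 against a log-distance-only ("purely continuous") fit. The function names and the exact functional form are assumptions made for this example.

```python
import numpy as np

def model_features(numbers):
    """Pairwise predictors for the upper triangle of a dissimilarity matrix
    over `numbers`: log-distance, plus an indicator for whether the pair
    straddles a digit-count boundary (e.g. 9|10 or 99|100)."""
    log_dist, boundary = [], []
    for i in range(len(numbers)):
        for j in range(i + 1, len(numbers)):
            a, b = numbers[i], numbers[j]
            log_dist.append(abs(np.log(a) - np.log(b)))
            boundary.append(float(len(str(a)) != len(str(b))))
    return np.array(log_dist), np.array(boundary)

def fit_r2(X, y):
    """Ordinary least-squares fit with an intercept; returns R^2."""
    X = np.column_stack([X, np.ones(len(y))])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - resid.var() / y.var()

def compare_models(numbers, dissim_upper):
    """Fit the continuous model (log-distance only) and the CP-additive
    model (log-distance + boundary boost) to observed pairwise
    dissimilarities; return (R^2_continuous, R^2_cp_additive)."""
    log_dist, boundary = model_features(numbers)
    r2_cont = fit_r2(log_dist[:, None], dissim_upper)
    r2_cp = fit_r2(np.column_stack([log_dist, boundary]), dissim_upper)
    return r2_cont, r2_cp
```

On a real run, `dissim_upper` would come from representational similarity analysis of a model's hidden states (e.g. pairwise cosine distances between hidden vectors for the numerals); here any vector of pairwise dissimilarities in upper-triangle order will do.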