[2603.28258] Categorical Perception in Large Language Model Hidden

[2603.28258] Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries

arXiv - AI March 31, 2026 4 min read

About this article

Abstract page for arXiv paper 2603.28258: Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries

Computer Science > Computation and Language arXiv:2603.28258 (cs) [Submitted on 30 Mar 2026] Title:Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries Authors:Jon-Paul Cacioli View a PDF of the paper titled Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries, by Jon-Paul Cacioli View PDF HTML (experimental) Abstract:Categorical perception (CP) -- enhanced discriminability at category boundaries -- is among the most studied phenomena in perceptual psychology. This paper reports that analogous geometric warping occurs in the hidden-state representations of large language models (LLMs) processing Arabic numerals. Using representational similarity analysis across six models from five architecture families, the study finds that a CP-additive model (log-distance plus a boundary boost) fits the representational geometry better than a purely continuous model at 100% of primary layers in every model tested. The effect is specific to structurally defined boundaries (digit-count transitions at 10 and 100), absent at non-boundary control positions, and absent in the temperature domain where linguistic categories (hot/cold) lack a tokenisation discontinuity. Two qualitatively distinct signatures emerge: "classic CP" (Gemma, Qwen), where models both categorise explicitly and show geometric warping, and "structural CP" (Llama, Mistral, Phi), where geometry warps at the boundary b...

Originally published on March 31, 2026. Curated by AI News.

Llms

Depth-first pruning seems to transfer from GPT-2 to Llama (unexpectedly well)

TL;DR: Removing the right transformer layers (instead of shrinking all layers) gives smaller, faster models with minimal quality loss — a...

Reddit - Artificial Intelligence · 1 min · about 1 hour ago

Llms

[2603.23966] Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage

Abstract page for arXiv paper 2603.23966: Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage

arXiv - AI · 4 min · about 2 hours ago

Llms

[2603.16790] InCoder-32B: Code Foundation Model for Industrial Scenarios

Abstract page for arXiv paper 2603.16790: InCoder-32B: Code Foundation Model for Industrial Scenarios

arXiv - AI · 4 min · about 2 hours ago

Llms

[2603.16430] EngGPT2: Sovereign, Efficient and Open Intelligence

Abstract page for arXiv paper 2603.16430: EngGPT2: Sovereign, Efficient and Open Intelligence

arXiv - AI · 4 min · about 2 hours ago

[2603.28258] Categorical Perception in Large Language Model Hidden States: Structural Warping at Digit-Count Boundaries

About this article

Related Articles

Depth-first pruning seems to transfer from GPT-2 to Llama (unexpectedly well)

[2603.23966] Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage

[2603.16790] InCoder-32B: Code Foundation Model for Industrial Scenarios

[2603.16430] EngGPT2: Sovereign, Efficient and Open Intelligence

No comments

Stay updated with AI News