[2604.05217] On the Geometry of Positional Encodings in Transformers

[2604.05217] On the Geometry of Positional Encodings in Transformers

arXiv - Machine Learning 4 min read

About this article

Abstract page for arXiv paper 2604.05217: On the Geometry of Positional Encodings in Transformers

Computer Science > Machine Learning arXiv:2604.05217 (cs) [Submitted on 6 Apr 2026] Title:On the Geometry of Positional Encodings in Transformers Authors:Giansalvo Cirrincione View a PDF of the paper titled On the Geometry of Positional Encodings in Transformers, by Giansalvo Cirrincione View PDF HTML (experimental) Abstract:Neural language models process sequences of words, but the mathematical operations inside them are insensitive to the order in which words appear. Positional encodings are the component added to remedy this. Despite their importance, positional encodings have been designed largely by trial and error, without a mathematical theory of what they ought to do. This paper develops such a theory. Four results are established. First, any Transformer without a positional signal cannot solve any task sensitive to word order (Necessity Theorem). Second, training assigns distinct vector representations to distinct sequence positions at every global minimiser, under mild and verifiable conditions (Positional Separation Theorem). Third, the best achievable approximation to an information-optimal encoding is constructed via classical multidimensional scaling (MDS) on the Hellinger distance between positional distributions; the quality of any encoding is measured by a single number, the stress (Proposition 5, Algorithm 1). Fourth, the optimal encoding has effective rank r = rank(B) <= n-1 and can be represented with r(n+d) parameters instead of nd (minimal parametrisa...

Originally published on April 08, 2026. Curated by AI News.

Related Articles

[2604.16909] PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations
Llms

[2604.16909] PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations

Abstract page for arXiv paper 2604.16909: PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations

arXiv - AI · 4 min ·
[2604.07802] Latent Anomaly Knowledge Excavation: Unveiling Sparse Sensitive Neurons in Vision-Language Models
Llms

[2604.07802] Latent Anomaly Knowledge Excavation: Unveiling Sparse Sensitive Neurons in Vision-Language Models

Abstract page for arXiv paper 2604.07802: Latent Anomaly Knowledge Excavation: Unveiling Sparse Sensitive Neurons in Vision-Language Models

arXiv - AI · 4 min ·
[2602.07605] Fine-R1: Make Multi-modal LLMs Excel in Fine-Grained Visual Recognition by Chain-of-Thought Reasoning
Llms

[2602.07605] Fine-R1: Make Multi-modal LLMs Excel in Fine-Grained Visual Recognition by Chain-of-Thought Reasoning

Abstract page for arXiv paper 2602.07605: Fine-R1: Make Multi-modal LLMs Excel in Fine-Grained Visual Recognition by Chain-of-Thought Rea...

arXiv - AI · 4 min ·
[2602.07096] RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid?
Llms

[2602.07096] RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid?

Abstract page for arXiv paper 2602.07096: RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid?

arXiv - AI · 3 min ·
More in Llms: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime