[2604.04155] The Geometric Alignment Tax: Tokenization vs. Continuous Geometry in Scientific Foundation Models

arXiv - Machine Learning · April 07, 2026 · 3 min read

About this article

arXiv:2604.04155 (cs) · Computer Science > Machine Learning · Submitted on 5 Apr 2026

Authors: Prashant C. Raju

Abstract: Foundation models for biology and physics optimize predictive accuracy, but their internal representations systematically fail to preserve the continuous geometry of the systems they model. We identify the root cause: the Geometric Alignment Tax, an intrinsic cost of forcing continuous manifolds through discrete categorical bottlenecks. Controlled ablations on synthetic dynamical systems demonstrate that replacing cross-entropy with a continuous head on an identical encoder reduces geometric distortion by up to 8.5x, while learned codebooks exhibit a non-monotonic double bind where finer quantization worsens geometry despite improving reconstruction. Under continuous objectives, three architectures differ by 1.3x; under discrete tokenization, they diverge by 3,000x. Evaluating 14 biological foundation models with rate-distortion theory and MINE, we identify three failure regimes: Local-Global Decoupling, Representational Compression, and Geometric Vacuity. A controlled experiment confirms that Evo 2's reverse-complement robustness on real DNA reflects conserved sequence compositi...
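To make the idea of a "discrete categorical bottleneck" concrete, here is a toy sketch. It is not the paper's experiment, model, or distortion metric; every name in it (`tokenize`, `circle_dist`, the choice of 16 bins, and the use of Pearson correlation between pairwise distances as a geometry score) is an assumption made for illustration. It samples points on a circle, then compares how well pairwise distances survive under a continuous readout (cos, sin coordinates) versus one-hot token ids, where any two different tokens are equally far apart.

```python
import math
import random


def pairwise(points, dist):
    """Distances over all unordered pairs of points."""
    n = len(points)
    return [dist(points[i], points[j]) for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def circle_dist(a, b):
    """Geodesic distance between two angles on the unit circle."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def token_dist(a, b):
    """One-hot tokens: identical ids are at distance 0, all others equal."""
    return 0.0 if a == b else 1.0


def tokenize(angle, n_bins):
    """Hypothetical tokenizer: bucket an angle into one of n_bins ids."""
    return min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)


rng = random.Random(0)
angles = [rng.uniform(0, 2 * math.pi) for _ in range(200)]
true_d = pairwise(angles, circle_dist)

# Continuous readout: keep 2-D coordinates, compare by Euclidean (chord) distance.
coords = [(math.cos(a), math.sin(a)) for a in angles]
corr_continuous = pearson(true_d, pairwise(coords, math.dist))

# Categorical readout: only the token id survives.
tokens = [tokenize(a, 16) for a in angles]
corr_tokens = pearson(true_d, pairwise(tokens, token_dist))

print(f"continuous readout: {corr_continuous:.3f}")
print(f"one-hot tokens:     {corr_tokens:.3f}")
```

In this toy setting the continuous readout tracks the true distances much more closely than the token ids do, because a categorical code only records whether two points share a bin. That is the intuition behind the abstract's claim, not a reproduction of its reported numbers.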

Originally published on April 07, 2026. Curated by AI News.

Related Articles

[2602.07238] Is there "Secret Sauce" in Large Language Model Development?

[2602.01203] Attention Sink Forges Native MoE in Attention Layers: Sink-Aware Training to Address Head Collapse

[2601.01322] LinMU: Multimodal Understanding Made Linear

[2512.05525] Poodle: Seamlessly Scaling Down Large Language Models with Just-in-Time Model Replacement
