[2602.00079] Embedding Compression via Spherical Coordinates
arXiv:2602.00079 (cs)
[Submitted on 22 Jan 2026 (v1), last revised 25 Mar 2026 (this version, v4)]

Title: Embedding Compression via Spherical Coordinates
Authors: Han Xiao

Abstract: We present an $\epsilon$-bounded compression method for unit-norm embeddings that achieves 1.5$\times$ compression, 25% better than the best prior lossless method. The method exploits that spherical coordinates of high-dimensional unit vectors concentrate around $\pi/2$, causing IEEE 754 exponents to collapse to a single value and high-order mantissa bits to become predictable, enabling entropy coding of both. Reconstruction error is bounded by float32 machine epsilon ($1.19 \times 10^{-7}$), making reconstructed values indistinguishable from originals at float32 precision. Evaluation across 26 configurations spanning text, image, and multi-vector embeddings confirms consistent compression improvement with zero measurable retrieval degradation on BEIR benchmarks.

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
MSC classes: 68T50
ACM classes: I.2.7
Cite as: arXiv:2602.00079 [cs.LG] (or arXiv:2602.00079v4 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.00079 (arXiv-issued DOI via DataCite)

Submission history
From: Han Xiao
[v1] Thu, 22 Jan 2026 03:21:0...
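The abstract's central observation, that spherical coordinates of high-dimensional unit vectors cluster near $\pi/2$ and therefore share one IEEE 754 exponent, can be checked with a small experiment. The sketch below is not the paper's implementation: it assumes the standard hyperspherical angle convention, a normalized Gaussian sample standing in for a real embedding, and an arbitrary dimension of 1024. All function names are hypothetical. It only compares how concentrated the float32 exponent field is before and after the coordinate change, and does no entropy coding.

```python
import math
import random
import struct
from collections import Counter


def float32_exponent(value):
    """Biased 8-bit exponent field of `value` after rounding to IEEE 754 float32."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return (bits >> 23) & 0xFF


def random_unit_vector(dim, rng):
    """Uniform random point on the unit sphere (normalized Gaussian sample)."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def to_spherical_angles(x):
    """Standard hyperspherical angles of a vector: len(x) - 1 values."""
    d = len(x)
    # tail_sq[k] = x[k]^2 + x[k+1]^2 + ... + x[d-1]^2
    tail_sq = [0.0] * (d + 1)
    for k in range(d - 1, -1, -1):
        tail_sq[k] = tail_sq[k + 1] + x[k] * x[k]
    # The first d-2 angles lie in [0, pi]; the last one lies in (-pi, pi].
    angles = [math.atan2(math.sqrt(tail_sq[k + 1]), x[k]) for k in range(d - 2)]
    angles.append(math.atan2(x[d - 1], x[d - 2]))
    return angles


def top_share(values):
    """Most common float32 exponent among `values` and the fraction sharing it."""
    counts = Counter(float32_exponent(v) for v in values)
    exponent, count = counts.most_common(1)[0]
    return exponent, count / len(values)


rng = random.Random(0)  # fixed seed, chosen arbitrarily
x = random_unit_vector(1024, rng)  # 1024 dimensions is an assumed size
angles = to_spherical_angles(x)

cart_exp, cart_share = top_share(x)
angle_exp, angle_share = top_share(angles)
print(f"Cartesian: top exponent {cart_exp}, share {cart_share:.3f}")
print(f"Spherical: top exponent {angle_exp}, share {angle_share:.3f}")
```

Every float32 value in $[1, 2)$ carries the biased exponent 127, and $\pi/2 \approx 1.5708$ sits inside that interval, so angles concentrated around $\pi/2$ land on a single exponent value. The Cartesian coordinates of the same vector are spread over several powers of two. The final few angles depend on only a handful of remaining coordinates and need not concentrate, so a small fraction of exponents will differ.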