[2603.19339] Spectral Tempering for Embedding Compression in Dense Passage Retrieval
Computer Science > Information Retrieval
arXiv:2603.19339 (cs)
[Submitted on 19 Mar 2026]

Title: Spectral Tempering for Embedding Compression in Dense Passage Retrieval
Authors: Yongkang Li, Panagiotis Eustratiadis, Evangelos Kanoulas

Abstract: Dimensionality reduction is critical for deploying dense retrieval systems at scale, yet mainstream post-hoc methods face a fundamental trade-off: principal component analysis (PCA) preserves dominant variance but underutilizes representational capacity, while whitening enforces isotropy at the cost of amplifying noise in the heavy-tailed eigenspectrum of retrieval embeddings. Intermediate spectral scaling methods unify these extremes by reweighting dimensions with a power coefficient $\gamma$, but treat $\gamma$ as a fixed hyperparameter that requires task-specific tuning. We show that the optimal scaling strength $\gamma$ is not a global constant: it varies systematically with target dimensionality $k$ and is governed by the signal-to-noise ratio (SNR) of the retained subspace. Based on this insight, we propose Spectral Tempering (SpecTemp), a learning-free method that derives an adaptive $\gamma(k)$ directly from the corpus eigenspectrum using local SNR analysis and knee-point normalization, requiring no labeled data or validation-based search. Extensive expe...
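
To make the fixed-$\gamma$ spectral scaling family mentioned in the abstract concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `spectral_scale` and the parameterization (scaling each retained principal direction by $\lambda_i^{-\gamma/2}$, so that $\gamma = 0$ gives plain PCA and $\gamma = 1$ gives whitening) are assumed here, and the adaptive $\gamma(k)$ rule of SpecTemp is not reproduced because the abstract does not specify it.

```python
import numpy as np


def spectral_scale(embeddings, k, gamma):
    """Project onto the top-k principal directions and rescale each one.

    Assumed convention (not taken from the paper): direction i is scaled by
    eigenvalue_i ** (-gamma / 2), so gamma=0 is PCA and gamma=1 is whitening.
    """
    # Center the corpus embeddings.
    centered = embeddings - embeddings.mean(axis=0)
    # Sample covariance of the centered embeddings.
    cov = centered.T @ centered / (len(embeddings) - 1)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the k largest eigenvalues and their eigenvectors.
    order = np.argsort(eigvals)[::-1][:k]
    top_vals = np.maximum(eigvals[order], 1e-12)  # guard against zeros
    top_vecs = eigvecs[:, order]
    # Project, then apply the power-law reweighting.
    projected = centered @ top_vecs
    return projected * top_vals ** (-gamma / 2.0)
```

With `gamma=1.0` the compressed embeddings have identity covariance (whitening); with `gamma=0.0` each dimension keeps its original variance (PCA). Intermediate values interpolate between the two, which is the knob the abstract says SpecTemp sets adaptively as a function of `k`.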