[2604.02119] AA-SVD : Anchored and Adaptive SVD for Large Language Model Compression
Computer Science > Machine Learning
arXiv:2604.02119 (cs) [Submitted on 2 Apr 2026]

Title: AA-SVD: Anchored and Adaptive SVD for Large Language Model Compression
Authors: Atul Kumar Sinha, François Fleuret

Abstract: We introduce a fast low-rank factorization-based framework for compressing large language models that enables rapid compression of billion-parameter models without retraining. Unlike existing factorization-based approaches that optimize only on the original inputs, ignoring distribution shifts from upstream compression and thus propagating errors forward, or those that rely only on shifted inputs and risk drifting away from the original outputs, our approach accounts for both. Beyond individual layer compression, we further refine each transformer block end-to-end, minimizing block-level output distortion and allowing compressed layers to jointly compensate for accumulated errors. By anchoring each compressed layer to the original outputs while explicitly modeling input distribution shifts, our method finds a low-rank approximation that maintains functional equivalence with the original model. Experiments on large language models show that our method consistently outperforms existing SVD-based baselines across compression ratios, with the advantage becoming increasingly pronounced at aggr...
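The core idea described in the abstract, fitting a low-rank layer on the *shifted* inputs it will actually see while anchoring its targets to the *original* layer's outputs, can be illustrated with a minimal NumPy sketch. This is not the paper's algorithm; the least-squares-then-truncate procedure, the function name `anchored_lowrank`, and the simulated drift are illustrative assumptions only.

```python
import numpy as np

def anchored_lowrank(W, X_orig, X_shift, rank):
    """Illustrative sketch (not the paper's method): find a rank-r weight
    W_c whose outputs on the shifted inputs X_shift stay close to the
    original layer's outputs W @ X_orig (the "anchor").
    Shapes: W is (d_out, d_in); X_orig, X_shift are (d_in, n_samples)."""
    # Anchor targets: what the uncompressed layer produced on clean inputs.
    Y = W @ X_orig
    # Least-squares map from the shifted inputs to the anchored outputs.
    W_ls = Y @ np.linalg.pinv(X_shift)
    # Truncate to rank r via SVD (plain truncation, chosen for simplicity).
    U, s, Vt = np.linalg.svd(W_ls, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
X = rng.standard_normal((16, 64))
# Simulate the distribution shift introduced by upstream compression.
X_shift = X + 0.05 * rng.standard_normal(X.shape)

W_c = anchored_lowrank(W, X, X_shift, rank=4)
# Relative error of the compressed layer on shifted inputs vs. the anchor.
err = np.linalg.norm(W_c @ X_shift - W @ X) / np.linalg.norm(W @ X)
print(W_c.shape, round(err, 3))
```

Because the fit uses `X_shift` as inputs but `W @ X_orig` as targets, it simultaneously adapts to the drifted activation distribution and stays tied to the original model's behavior, which is the trade-off the abstract contrasts against input-only and shift-only baselines.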