[2509.22075] CoSpaDi: Compressing LLMs via Calibration-Guided Sparse Dictionary Learning
Summary
The paper introduces CoSpaDi, a training-free framework for compressing large language models (LLMs) via calibration-guided sparse dictionary learning, improving accuracy over conventional low-rank (SVD-based) compression.
Why It Matters
As LLMs grow in size, efficient compression techniques are crucial for deployment in resource-constrained environments. CoSpaDi offers a promising alternative to existing methods, potentially enhancing model performance while reducing computational costs.
Key Takeaways
- CoSpaDi replaces low-rank factorization with structured sparse decomposition for better model expressiveness.
- The framework minimizes functional reconstruction error using a calibration set, enhancing accuracy.
- CoSpaDi achieves 20-40% compression ratios while maintaining performance, outperforming SVD-based methods.
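The structured sparse decomposition in the first takeaway can be sketched in NumPy: each column of the weight matrix W is rebuilt from its own small subset of dictionary atoms, giving a union-of-subspaces model rather than one shared low-rank subspace. The dimensions, the sparsity level `s`, and the random per-column supports below are illustrative assumptions; the actual method optimizes the dictionary and supports (this sketch only fits coefficients by least squares on a fixed random support).

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 64, 64      # toy layer dimensions (assumed for illustration)
k, s = 16, 4              # number of dictionary atoms, non-zeros per column (assumed)

W = rng.standard_normal((d_out, d_in))   # dense weight matrix to compress

# Dense dictionary D and column-sparse coefficient matrix C: each column of W
# is approximated from its own subset of s atoms, so different columns can lie
# in different subspaces (union of subspaces), unlike a single rank-r subspace.
D = rng.standard_normal((d_out, k))
C = np.zeros((k, d_in))
for j in range(d_in):
    support = rng.choice(k, size=s, replace=False)   # illustrative random support
    # least-squares fit of column j using only its selected atoms
    C[support, j] = np.linalg.lstsq(D[:, support], W[:, j], rcond=None)[0]

W_hat = D @ C                              # reconstructed weight matrix
nnz_per_col = (C != 0).sum(axis=0)         # structured sparsity: at most s per column
```

At a fixed parameter budget, storing D plus s values (and indices) per column can be more expressive than a rank-r factorization that forces every column into the same r-dimensional subspace.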
Paper Details
arXiv:2509.22075 (cs.CL). Submitted on 26 Sep 2025 (v1); last revised 19 Feb 2026 (v4).
Authors: Denis Makhov, Dmitriy Shopkhoev, Magauiya Zhussip, Ammar Ali, Stamatios Lefkimmiatis
Abstract
Post-training compression of large language models (LLMs) often relies on low-rank weight approximations that represent each column of the weight matrix in a shared low-dimensional subspace. This strategy is computationally efficient, but the underlying constraint can be overly rigid for heterogeneous projection weights and may incur avoidable accuracy loss. We propose CoSpaDi (Compression via Sparse Dictionary Learning), a training-free framework that replaces low-rank factorization with a structured sparse decomposition in which each weight matrix is represented as a dense dictionary multiplied by a column-sparse coefficient matrix. This yields a union-of-subspaces model: the columns of the weight matrix are represented as linear combinations of different subsets of dictionary atoms, improving expressiveness at a fixed parameter budget. CoSpaDi is calibration-guided: using a small calibration set, we optimize the factorization to minimize functional reconstruction error of layer ou...
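The calibration-guided objective described in the abstract, fitting the factorization to preserve layer outputs on calibration data rather than to match the raw weights, can be sketched as follows. The toy sizes, the fixed random sparsity pattern, and the random "calibration" activations are assumptions for illustration; the paper's method additionally learns the dictionary and supports.

```python
import numpy as np

rng = np.random.default_rng(1)
n_calib, d_in, d_out = 256, 32, 32   # calibration samples and layer sizes (assumed)
k, s = 12, 3                         # dictionary atoms, non-zeros per column (assumed)

X = rng.standard_normal((n_calib, d_in))   # calibration activations entering the layer
W = rng.standard_normal((d_in, d_out))     # layer weight (inputs x outputs)
D = rng.standard_normal((d_in, k))         # dense dictionary (here random, not learned)

Y = X @ W                                  # layer outputs we want to preserve
XD = X @ D
C = np.zeros((k, d_out))
for j in range(d_out):
    support = rng.choice(k, size=s, replace=False)  # illustrative fixed support
    # minimize the functional error ||X w_j - X D c_j||, not ||w_j - D c_j||:
    # the fit is weighted by how the calibration data actually excites the layer
    C[support, j] = np.linalg.lstsq(XD[:, support], Y[:, j], rcond=None)[0]

func_err = np.linalg.norm(Y - XD @ C) / np.linalg.norm(Y)  # relative output error
```

The design point is that two factorizations with equal weight-space error can produce very different layer outputs on realistic inputs; optimizing the functional error directly targets what downstream layers actually see.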