[2602.15200] COMPOT: Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers Compression
Summary
The paper presents COMPOT, a framework for compressing Transformer models via Calibration-Optimized Matrix Procrustes Orthogonalization, which reduces model size without retraining while preserving accuracy.
Why It Matters
As Transformer models grow in size, efficient compression methods are essential for deployment in resource-constrained environments. COMPOT addresses the limitations of existing methods by offering a training-free approach that maintains model performance while reducing size, making it relevant for researchers and practitioners in machine learning and AI.
Key Takeaways
- COMPOT offers a training-free compression method for Transformers.
- Utilizes orthogonal dictionaries for efficient weight factorization.
- Introduces a dynamic allocation strategy for layer-wise compression.
- Demonstrates superior quality-compression trade-off compared to existing methods.
- Compatible with post-training quantization for extreme compression.
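The truncated-SVD baseline that these takeaways contrast against can be sketched in a few lines. This is an illustrative example, not code from the paper: a weight matrix `W` is replaced by two thin factors, which shrinks storage whenever the rank `r` is small relative to the matrix dimensions.

```python
import numpy as np

def truncated_svd_compress(W, r):
    """Compress W (m x n) into factors A (m x r) and B (r x n).

    By the Eckart-Young theorem, A @ B is the best rank-r
    approximation of W in Frobenius norm. Storage drops from
    m*n to r*(m + n) values. Names here are illustrative only.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]   # absorb singular values into the left factor
    B = Vt[:r, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 128))
A, B = truncated_svd_compress(W, r=16)
err = np.linalg.norm(W - A @ B)  # residual of the rank-16 approximation
```

Enforcing one shared low-rank subspace per layer in this way is exactly the rigidity the abstract below argues against.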
Computer Science > Machine Learning
arXiv:2602.15200 (cs)
[Submitted on 16 Feb 2026]
Title: COMPOT: Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers Compression
Authors: Denis Makhov, Dmitriy Shopkhoev, Magauiya Zhussip, Ammar Ali, Baher Mohammad, Stamatios Lefkimmiatis
Abstract: Post-training compression of Transformer models commonly relies on truncated singular value decomposition (SVD). However, enforcing a single shared subspace can degrade accuracy even at moderate compression. Sparse dictionary learning provides a more flexible union-of-subspaces representation, but existing approaches often suffer from iterative dictionary and coefficient updates. We propose COMPOT (Calibration-Optimized Matrix Procrustes Orthogonalization for Transformers), a training-free compression framework that uses a small calibration dataset to estimate a sparse weight factorization. COMPOT employs orthogonal dictionaries that enable closed-form Procrustes updates for the dictionary and analytical single-step sparse coding for the coefficients, eliminating iterative optimization. To handle heterogeneous layer sensitivity under a global compression budget, COMPOT further introduces a one-shot dynamic allocation strategy that adaptively redistributes layer-wise compression rates. E...
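The closed-form Procrustes update the abstract refers to is the classical orthogonal Procrustes solution, sketched below. This is a minimal illustration of that standard step, not the paper's exact algorithm: with the sparse coefficients C held fixed, the orthogonal dictionary D minimizing ||W - D C||_F is obtained in one shot from the SVD of W C^T.

```python
import numpy as np

def procrustes_update(W, C):
    """Closed-form orthogonal Procrustes step (illustrative sketch).

    Given W (d x n) and fixed coefficients C (d x n), the orthogonal
    D minimizing ||W - D @ C||_F is D = U @ Vt, where U, Vt come from
    the SVD of W @ C.T. No iterative optimization is needed.
    """
    U, _, Vt = np.linalg.svd(W @ C.T)
    return U @ Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
C = rng.standard_normal((8, 16))
D = procrustes_update(W, C)
# The update always returns an orthogonal matrix: D.T @ D = I.
assert np.allclose(D.T @ D, np.eye(8), atol=1e-8)
```

Alternating this dictionary step with a single-step sparse-coding update for C is what lets a scheme like the one described avoid the iterative inner loops of classical dictionary learning.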