[2602.17681] LATMiX: Learnable Affine Transformations for Microscaling Quantization of LLMs
Summary
The paper presents LATMiX, a method that improves quantization of large language models (LLMs) through learnable invertible affine transformations, targeting the microscaling (MX) data format and improving accuracy in low-bit settings.
Why It Matters
As large language models become increasingly resource-intensive, reducing their memory and compute costs through quantization is crucial. LATMiX addresses limitations of current transformation-based methods under the MX format, potentially enabling more efficient deployment of LLMs across various applications.
Key Takeaways
- LATMiX introduces learnable invertible affine transformations for better quantization.
- The method shows improved accuracy in low-bit quantization scenarios.
- Theoretical analysis highlights the importance of activation distribution in quantization.
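The core idea behind transformation-based quantization can be illustrated with a minimal NumPy sketch. All names here are hypothetical: a fixed random rotation stands in for the learned transform (LATMiX learns a general invertible affine map, of which a rotation is only the simplest special case), and a plain symmetric per-tensor quantizer stands in for the MX scheme. The sketch shows how spreading an activation outlier across dimensions before quantizing reduces the reconstruction error:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64

def quantize(x, bits=4):
    # Symmetric per-tensor uniform quantizer (a stand-in for a real MX quantizer).
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax
    return np.clip(np.round(x / scale), -qmax, qmax) * scale

x = rng.normal(size=d)
x[0] = 50.0                     # inject a single large activation outlier

# Random rotation as a stand-in for the learned invertible transform.
T, _ = np.linalg.qr(rng.normal(size=(d, d)))

err_naive = np.linalg.norm(x - quantize(x))
# Quantize in the transformed space, then map back with the exact inverse (T.T).
err_trans = np.linalg.norm(x - T.T @ quantize(T @ x))

print(err_naive, err_trans)
```

Without the transform, the outlier forces a large quantization scale and the small entries are rounded away; after the rotation the values are more evenly distributed, so the same bit budget yields a smaller error.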
arXiv:2602.17681 [cs.LG] · Submitted on 4 Feb 2026
Authors: Ofir Gordon, Lior Dikstein, Arnon Netzer, Idan Achituve, Hai Victor Habi
Abstract
Post-training quantization (PTQ) is a widely used approach for reducing the memory and compute costs of large language models (LLMs). Recent studies have shown that applying invertible transformations to activations can significantly improve quantization robustness by reducing activation outliers; however, existing approaches are largely restricted to rotation- or Hadamard-based transformations. Moreover, most studies have focused on traditional quantization schemes, whereas modern hardware increasingly supports the microscaling (MX) data format. Attempts to combine the two have shown severe performance degradation, leading prior work to place restrictive assumptions on the transformations. In this work, we take a complementary perspective. First, we provide a theoretical analysis of transformations under MX quantization by deriving a bound on the quantization error. Our analysis emphasizes the importance of accounting for both the activation distribution and the underlying quantization structure. Building on this analysis, we propose LATMiX, a method that g...
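The MX format referenced in the abstract groups values into small blocks (32 elements in the OCP MX specification) that share one power-of-two scale, with each element stored at low precision. A toy version of that structure, using signed integers in place of the spec's FP4/FP6/FP8 element types (the function name and simplifications are this summary's, not the paper's):

```python
import numpy as np

def mx_quantize(x, block=32, bits=4):
    """Toy microscaling quantizer: each block of `block` values shares a single
    power-of-two scale (as in the MX E8M0 scale); elements are stored as
    low-bit symmetric integers. Assumes x is 1-D with size a multiple of `block`."""
    qmax = 2 ** (bits - 1) - 1
    xb = x.reshape(-1, block)
    amax = np.max(np.abs(xb), axis=1, keepdims=True)
    amax = np.where(amax == 0, 1.0, amax)           # avoid log2(0) for all-zero blocks
    # Smallest power-of-two scale such that amax / scale <= qmax.
    scale = 2.0 ** np.ceil(np.log2(amax / qmax))
    q = np.clip(np.round(xb / scale), -qmax, qmax)
    return (q * scale).reshape(-1)
```

One useful property of per-block scaling is outlier containment: a large value inflates the scale of its own 32-element block only, leaving the quantization of every other block unaffected, whereas a per-tensor scale would degrade all of them.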