[2505.12988] Optimal Formats for Weight Quantisation
Summary
This paper presents a systematic framework for designing weight quantisation formats in deep learning, demonstrating that formats which exploit variable-length codes achieve better quality at a given model size than fixed-length formats.
Why It Matters
Weight quantisation is crucial for optimising deep learning models, especially in resource-constrained environments. This research replaces the common practice of choosing formats empirically with a principled design framework, potentially enabling smaller, more efficient model deployments at the same output quality.
Key Takeaways
- Proposes a framework for systematic design of quantisation formats.
- Highlights the advantages of variable-length coding in quantisation.
- Develops non-linear quantisation curves for block-scaled data that consistently outperform fixed-length formats.
- Shows potential savings of up to 0.25 bits per parameter in large language models.
- Connects quantisation design with classical quantisation theory.
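The variable-length-coding advantage in the takeaways above can be illustrated with a small, self-contained sketch (not the paper's code; the quantiser, codebook size, and distribution here are illustrative assumptions). Gaussian-like weights are snapped to an 8-level uniform grid, and the resulting indices are compared under fixed-length coding (3 bits each) versus a Huffman code built from their empirical frequencies: because central levels occur far more often, the variable-length code spends fewer bits per parameter on average.

```python
# Hypothetical sketch (not the paper's method): fixed- vs variable-length
# coding of quantiser indices for Gaussian-distributed "weights".
import heapq
import math
import random
from collections import Counter

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(50_000)]

# 8-level uniform quantiser over [-3, 3]: nearest-level index per weight.
levels = [-3 + 6 * (i + 0.5) / 8 for i in range(8)]
indices = [min(range(8), key=lambda i: abs(w - levels[i])) for w in weights]

# Empirical index probabilities.
counts = Counter(indices)
probs = [counts[i] / len(indices) for i in range(8)]

# Huffman code lengths via a priority queue of (probability, id, leaf depths).
heap = [(p, i, {i: 0}) for i, p in enumerate(probs) if p > 0]
heapq.heapify(heap)
uid = len(heap)  # unique tie-breaker so dicts are never compared
while len(heap) > 1:
    p1, _, d1 = heapq.heappop(heap)
    p2, _, d2 = heapq.heappop(heap)
    merged = {k: v + 1 for k, v in {**d1, **d2}.items()}  # one level deeper
    heapq.heappush(heap, (p1 + p2, uid, merged))
    uid += 1
code_lengths = heap[0][2]  # leaf index -> codeword length in bits

fixed_bits = math.log2(8)  # 3 bits per index under fixed-length coding
var_bits = sum(probs[i] * code_lengths[i] for i in code_lengths)
print(f"fixed: {fixed_bits:.2f} bits/param, Huffman: {var_bits:.2f} bits/param")
```

The expected Huffman length sits close to the entropy of the index distribution, which for bell-shaped weights is well below the fixed-length cost; the gap is the kind of per-parameter saving the paper attributes to variable-length codes.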
Computer Science > Machine Learning
arXiv:2505.12988 (cs)
[Submitted on 19 May 2025 (v1), last revised 13 Feb 2026 (this version, v3)]
Title: Optimal Formats for Weight Quantisation
Authors: Douglas Orr, Luka Ribar, Carlo Luschi
Abstract: Weight quantisation is an essential technique for enabling efficient training and deployment of modern deep learning models. However, the recipe book of quantisation formats is large and formats are often chosen empirically. In this paper, we propose a framework for systematic design and analysis of quantisation formats. By connecting the question of format design with the classical quantisation theory, we show that the strong practical performance of popular formats comes from their ability to represent values using variable-length codes. We frame the problem as minimising the KL divergence between original and quantised model outputs under a model size constraint, which can be approximated by minimising the squared quantisation error, a well-studied problem where entropy-constrained quantisers with variable-length codes are optimal. We develop non-linear quantisation curves for block-scaled data across multiple distribution families and observe that these formats, along with sparse outlier formats, consistently outperform fixed-length formats, indicating that they also exploit variable-length encoding. Finally, by using the r...
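The abstract's contrast between linear and non-linear quantisation curves for block-scaled data can be sketched as follows. This is an illustrative toy only: the block size, 4-bit codebooks, and the choice of standard-normal quantiles as the non-linear grid are assumptions in the spirit of, but not identical to, the paper's optimised curves. Each block is scaled by its absolute maximum, values are snapped to the nearest grid point in [-1, 1], and the squared quantisation error (the proxy objective the abstract describes) is compared across the two grids.

```python
# Illustrative sketch only: block-scaled 4-bit quantisation of Gaussian
# weights, comparing a uniform grid against a non-linear grid of
# standard-normal quantiles (an assumed stand-in for optimised curves).
import random
from statistics import NormalDist

random.seed(1)
weights = [random.gauss(0.0, 1.0) for _ in range(16_384)]
BLOCK = 64  # assumed block size

def quantise(ws, grid):
    """Block-scaled quantisation: scale each block by its absmax,
    snap to the nearest grid point in [-1, 1], then rescale."""
    out = []
    for start in range(0, len(ws), BLOCK):
        block = ws[start:start + BLOCK]
        scale = max(abs(w) for w in block) or 1.0
        for w in block:
            q = min(grid, key=lambda g: abs(w / scale - g))
            out.append(q * scale)
    return out

def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# 16 levels, uniform in [-1, 1].
linear_grid = [-1 + 2 * i / 15 for i in range(16)]

# 16 levels at standard-normal quantiles, normalised to [-1, 1]:
# denser near zero, where scaled Gaussian weights concentrate.
nd = NormalDist()
qs = [nd.inv_cdf((i + 0.5) / 16) for i in range(16)]
nonlinear_grid = [q / max(abs(x) for x in qs) for q in qs]

err_lin = mse(weights, quantise(weights, linear_grid))
err_nl = mse(weights, quantise(weights, nonlinear_grid))
print(f"linear grid MSE: {err_lin:.5f}, non-linear grid MSE: {err_nl:.5f}")
```

Because block scaling leaves most values clustered near zero, a grid that spends its levels where the probability mass is achieves lower squared error at the same bit width, which is the mechanism the abstract credits for non-linear curves outperforming fixed, uniform formats.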