[2602.17849] Dual Length Codes for Lossless Compression of BFloat16
Summary
This paper presents Dual Length Codes, a hybrid approach for lossless compression of BFloat16 data that improves decoding speed while retaining compression efficiency close to that of traditional methods such as Huffman coding.
Why It Matters
As large language models (LLMs) become more prevalent, efficient data compression techniques are crucial for optimizing network bandwidth and hardware performance. This research addresses the limitations of existing compression methods, potentially enhancing the efficiency of LLM training and deployment.
Key Takeaways
- Introduces Dual Length Codes for efficient lossless compression.
- Achieves 18.6% compression, slightly below Huffman codes, but with much faster decoding.
- Uses a simple look-up table (LUT) for encoding and decoding, reducing hardware complexity.
Computer Science > Machine Learning
arXiv:2602.17849 (cs) [Submitted on 19 Feb 2026]
Title: Dual Length Codes for Lossless Compression of BFloat16
Authors: Aditya Agrawal, Albert Magyar, Hiteshwar Eswaraiah, Patrick Sheridan, Pradeep Janedula, Ravi Krishnan Venkatesan, Krishna Nair, Ravi Iyer
Abstract: Training and serving Large Language Models (LLMs) rely heavily on parallelization and collective operations, which are frequently bottlenecked by network bandwidth. Lossless compression using, e.g., Huffman codes can alleviate the issue; however, Huffman codes suffer from slow, bit-sequential decoding and high hardware complexity due to deep tree traversals. Universal codes, e.g., Exponential-Golomb codes, are faster to decode but do not exploit the symbol frequency distributions. To address these limitations, this paper introduces Dual Length Codes, a hybrid approach designed to balance compression efficiency with decoding speed. Analyzing BFloat16 tensors from the Gemma model, we observed that the top 8 most frequent symbols account for approximately 50% of the cumulative probability. These 8 symbols are assigned a short 4-bit code. The remaining 248 symbols are assigned a longer 9-bit code. The coding scheme uses a single prefix bit to distinguish between the two code lengths. The scheme uses a small Look Up Table with only ...
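The scheme the abstract describes can be sketched in a few lines of Python. This is a hypothetical illustration assumed from the abstract alone, not the authors' implementation: 8-bit symbols in a top-8 LUT get a 4-bit code (prefix bit 0 plus a 3-bit index), and all other symbols get a 9-bit code (prefix bit 1 plus the raw byte).

```python
# Sketch of a dual-length prefix code over 8-bit symbols, assumed from
# the abstract: top-8 symbols -> '0' + 3-bit index (4 bits total);
# all other symbols -> '1' + raw 8-bit value (9 bits total).
from collections import Counter

def build_top8(symbols):
    """Return the 8 most frequent symbols and a symbol -> index LUT."""
    top8 = [s for s, _ in Counter(symbols).most_common(8)]
    return top8, {s: i for i, s in enumerate(top8)}

def encode(symbols, lut):
    bits = []
    for s in symbols:
        if s in lut:
            bits.append(format(lut[s], "04b"))   # leading 0 + 3-bit index
        else:
            bits.append("1" + format(s, "08b"))  # prefix 1 + raw byte
    return "".join(bits)

def decode(bits, top8):
    out, i = [], 0
    while i < len(bits):
        if bits[i] == "0":                       # short code: LUT index
            out.append(top8[int(bits[i + 1:i + 4], 2)])
            i += 4
        else:                                    # long code: raw byte
            out.append(int(bits[i + 1:i + 9], 2))
            i += 9
    return out
```

If, as the abstract reports, the 8 short-code symbols cover roughly 50% of the probability mass, the expected code length is about 0.5·4 + 0.5·9 = 6.5 bits per 8-bit symbol, an ~18.75% reduction, which is consistent with the 18.6% compression figure.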