[2507.18553] The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
Computer Science > Machine Learning

arXiv:2507.18553 (cs)

[Submitted on 24 Jul 2025 (v1), last revised 2 Mar 2026 (this version, v3)]

Title: The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm

Authors: Jiale Chen, Yalda Shabanzadeh, Elvir Crnčević, Torsten Hoefler, Dan Alistarh

Abstract: Quantizing the weights of large language models (LLMs) from 16-bit to lower bitwidths is the de facto approach for deploying massive transformers on more affordable accelerators. While GPTQ has emerged as one of the standard methods for one-shot post-training quantization at LLM scale, its inner workings are typically described as a sequence of algebraic updates that obscure any geometric meaning or worst-case guarantees. In this work, we show that, when executed back-to-front (from the last dimension to the first) on a linear layer, GPTQ is mathematically identical to Babai's nearest plane algorithm for the classical closest vector problem (CVP) on a lattice defined by the Hessian matrix of the layer's inputs. This equivalence rests on a careful mathematical argument and has two analytical consequences: first, the GPTQ error-propagation step gains an intuitive geometric interpretation; second, GPTQ inherits the error upper bound of Babai's algorithm under the assumption that no weights are clipped. Leveraging this bou...
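To make the abstract's claim concrete, the following is a minimal sketch of Babai's nearest plane algorithm, the lattice routine that the paper proves GPTQ coincides with when run back-to-front. The function name, the QR-based formulation, and the NumPy setup are illustrative choices, not taken from the paper; the paper's actual construction uses the lattice defined by the Hessian of the layer's inputs.

```python
import numpy as np

def babai_nearest_plane(B, t):
    """Approximate the lattice point closest to target t, where the
    lattice is generated by the columns of basis matrix B (a sketch,
    not the paper's GPTQ-specific construction).

    Via the QR decomposition B = Q R (R upper triangular), coefficients
    are rounded back-to-front, from the last dimension to the first --
    the same execution order under which the paper shows GPTQ and
    Babai's algorithm coincide.
    """
    Q, R = np.linalg.qr(B)
    y = Q.T @ t                 # target expressed in the Gram-Schmidt frame
    n = B.shape[1]
    c = np.zeros(n)
    for i in reversed(range(n)):
        # Subtract the contribution of the already-fixed coefficients
        # c[i+1:] before rounding -- the geometric analogue of GPTQ's
        # error-propagation step -- then snap to the nearest plane.
        c[i] = np.round((y[i] - R[i, i + 1:] @ c[i + 1:]) / R[i, i])
    return B @ c, c
```

For an orthogonal basis this reduces to coordinate-wise rounding; for a skewed basis the back-substitution term is what compensates for earlier rounding decisions, mirroring how GPTQ pushes quantization error onto not-yet-quantized weights.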