[2602.16086] LGQ: Learning Discretization Geometry for Scalable and Stable Image Tokenization
Summary
The paper presents LGQ, a novel image tokenizer that learns its discretization geometry to improve scalability and stability in visual generation, outperforming existing tokenizers in both efficiency and representation quality.
Why It Matters
As image generation technologies advance, efficient tokenization becomes crucial for maintaining quality while reducing computational demands. LGQ addresses the limitations of current quantizers, making it significant for researchers and practitioners in computer vision and machine learning.
Key Takeaways
- LGQ introduces a learnable discretization geometry for image tokenization.
- It replaces hard nearest-neighbor lookups with soft assignments for better training.
- The method achieves improved performance metrics with fewer active codes.
- LGQ balances code utilization without rigid geometrical constraints.
- The approach is validated on ImageNet, demonstrating significant enhancements over existing tokenizers.
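To make the second takeaway concrete, here is a minimal sketch of temperature-controlled soft assignment replacing hard nearest-neighbor lookup. This is an illustrative reconstruction, not the paper's implementation: the function name, the use of negative squared distances as logits, and the temperature `tau` are assumptions based on the abstract's description.

```python
import numpy as np

def soft_quantize(z, codebook, tau=1.0, hard=False):
    """Temperature-controlled soft codebook assignment (illustrative sketch).

    z: (d,) latent vector; codebook: (K, d) code vectors.
    Returns the quantized vector and the assignment distribution.
    """
    # Negative squared distances to each code act as assignment logits.
    d2 = np.sum((codebook - z) ** 2, axis=1)
    logits = -d2 / tau
    logits -= logits.max()                      # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()   # soft posterior over codes
    if hard:
        # Inference: recover the hard nearest-neighbor assignment.
        return codebook[np.argmax(p)], p
    # Training: a differentiable convex combination of code vectors.
    return p @ codebook, p
```

As `tau` shrinks, the soft assignment sharpens toward the nearest code, so the training-time relaxation converges to the inference-time hard lookup.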
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.16086 (cs)
[Submitted on 17 Feb 2026]
Authors: Idil Bilge Altun, Mert Onur Cakiroglu, Elham Buxton, Mehmet Dalkilic, Hasan Kurban

Abstract: Discrete image tokenization is a key bottleneck for scalable visual generation: a tokenizer must remain compact for efficient latent-space priors while preserving semantic structure and using discrete capacity effectively. Existing quantizers face a trade-off: vector-quantized tokenizers learn flexible geometries but often suffer from biased straight-through optimization, codebook under-utilization, and representation collapse at large vocabularies. Structured scalar or implicit tokenizers ensure stable, near-complete utilization by design, yet rely on fixed discretization geometries that may allocate capacity inefficiently under heterogeneous latent statistics. We introduce Learnable Geometric Quantization (LGQ), a discrete image tokenizer that learns discretization geometry end-to-end. LGQ replaces hard nearest-neighbor lookup with temperature-controlled soft assignments, enabling fully differentiable training while recovering hard assignments at inference. The assignments correspond to posterior respon...
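The codebook under-utilization the abstract mentions is commonly diagnosed with the perplexity of the empirical code-usage distribution. The helper below is a standard diagnostic, not from the paper; its name and interface are assumptions.

```python
import numpy as np

def codebook_perplexity(assignments, K):
    """Perplexity of code usage over a batch of token assignments.

    K means perfectly uniform (full) utilization; 1 means collapse
    to a single code. assignments: integer code indices in [0, K).
    """
    counts = np.bincount(assignments, minlength=K).astype(float)
    p = counts / counts.sum()          # empirical usage distribution
    nz = p[p > 0]
    entropy = -np.sum(nz * np.log(nz)) # Shannon entropy of usage
    return float(np.exp(entropy))      # effective number of active codes
```

A tokenizer whose perplexity stays far below K at large vocabularies is wasting discrete capacity, which is the failure mode LGQ's learnable geometry is meant to avoid.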