[2602.18904] PCA-VAE: Differentiable Subspace Quantization without Codebook Collapse
Summary
The paper introduces PCA-VAE, a novel approach to vector-quantized autoencoders that replaces traditional quantization methods with a differentiable PCA-based alternative, enhancing reconstruction quality while reducing complexity.
Why It Matters
PCA-VAE addresses critical limitations in existing vector quantization methods by providing a mathematically grounded, stable, and efficient alternative. This innovation could significantly impact generative models in machine learning and computer vision, offering a new direction for future research and applications.
Key Takeaways
- PCA-VAE replaces non-differentiable quantizers with a differentiable PCA bottleneck.
- It achieves superior reconstruction quality while using 10-100x fewer latent bits than VQ-GAN and SimVQ.
- The model produces interpretable dimensions without the need for adversarial regularization.
- PCA-VAE offers a viable alternative to vector quantization, emphasizing stability and efficiency.
- This approach opens new avenues for generative model development beyond traditional methods.
Computer Science > Machine Learning
arXiv:2602.18904 (cs) [Submitted on 21 Feb 2026]
Title: PCA-VAE: Differentiable Subspace Quantization without Codebook Collapse
Authors: Hao Lu, Onur C. Koyun, Yongxin Guo, Zhengjie Zhu, Abbas Alili, Metin Nafi Gurcan
Abstract: Vector-quantized autoencoders deliver high-fidelity latents but suffer inherent flaws: the quantizer is non-differentiable, requires straight-through hacks, and is prone to collapse. We address these issues at the root by replacing VQ with a simple, principled, and fully differentiable alternative: an online PCA bottleneck trained via Oja's rule. The resulting model, PCA-VAE, learns an orthogonal, variance-ordered latent basis without codebooks, commitment losses, or lookup noise. Despite its simplicity, PCA-VAE exceeds VQ-GAN and SimVQ in reconstruction quality on CelebA-HQ while using 10-100x fewer latent bits. It also produces naturally interpretable dimensions (e.g., pose, lighting, gender cues) without adversarial regularization or disentanglement objectives. These results suggest that PCA is a viable replacement for VQ: mathematically grounded, stable, bit-efficient, and semantically structured, offering a new direction for generative models beyond vector quantization.
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
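The abstract's key technical ingredient is an online PCA bottleneck trained via Oja's rule, which updates a direction vector with a Hebbian term plus a decay that keeps its norm near 1, converging to the leading principal component of the input stream. The sketch below is a minimal, generic illustration of Oja's rule on synthetic data, not the paper's actual implementation (the paper applies a multi-component variant inside an autoencoder); the data shapes, learning rate, and iteration counts are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean data whose leading principal direction lies along axis 0.
# (Oja's rule assumes centered inputs; shapes and scales are illustrative.)
n, d = 5000, 8
X = rng.normal(size=(n, d))
X[:, 0] *= 5.0            # inflate variance along the first axis
X -= X.mean(axis=0)

def oja_update(w, x, lr=1e-3):
    """One step of Oja's rule: w <- w + lr * y * (x - y * w), where y = w.x.
    The Hebbian term y*x pulls w toward high-variance directions; the
    -y^2 * w decay keeps ||w|| close to 1 without explicit normalization."""
    y = w @ x
    return w + lr * y * (x - y * w)

w = rng.normal(size=d)
w /= np.linalg.norm(w)
for epoch in range(20):
    for x in X:
        w = oja_update(w, x)

# Sanity check: compare against the exact leading eigenvector from batch PCA.
cov = X.T @ X / n
eigvals, eigvecs = np.linalg.eigh(cov)
pc1 = eigvecs[:, -1]                      # eigenvector of the largest eigenvalue
alignment = abs(w @ pc1) / np.linalg.norm(w)
```

After training, `alignment` should be close to 1, showing that the streamed updates recover the same direction as batch PCA. Extending this to an orthogonal, variance-ordered basis of several components (as PCA-VAE's latent space requires) is typically done with a subspace or deflation variant such as Sanger's rule.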