[2602.17270] Unified Latents (UL): How to train your latents
Summary
The paper introduces Unified Latents (UL), a framework for learning latent representations that are jointly regularized by a diffusion prior and decoded by a diffusion model, achieving competitive results on image and video benchmarks.
Why It Matters
This research presents a new approach to training latent representations that improves both efficiency and sample quality in generative models. The reported results on image and video benchmarks suggest that learning latents jointly with a diffusion prior can replace fixed, pretrained latent spaces in image and video generation pipelines, which are central to modern AI and computer vision applications.
Key Takeaways
- Unified Latents (UL) framework improves training of latent representations.
- Achieves a competitive FID of 1.4 on ImageNet-512 while maintaining high reconstruction quality (PSNR).
- Sets a new state-of-the-art FVD of 1.3 on Kinetics-600.
- Requires fewer training FLOPs compared to models using Stable Diffusion latents.
- Links the encoder's output noise to the prior's minimum noise level, yielding a simple training objective that tightly upper-bounds the latent bitrate.
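The core trick summarized in the last takeaway can be illustrated with a toy sketch: the encoder's output is perturbed with Gaussian noise whose scale equals the diffusion prior's minimum noise level, so the latent the prior sees matches its lowest noise scale. Everything here (the linear "encoder", `SIGMA_MIN`, shapes) is a hypothetical stand-in for illustration, not the paper's actual architecture or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical minimum noise level of the diffusion prior's schedule.
SIGMA_MIN = 0.002

def encode(x, W, sigma_min=SIGMA_MIN):
    """Toy linear encoder whose output noise is tied to sigma_min.

    The key idea (as summarized above): instead of choosing the encoder's
    output noise independently, it is set to the prior's minimum noise
    level, so the latent distribution matches what the diffusion prior
    models at its smallest noise scale.
    """
    z_mean = x @ W
    z = z_mean + sigma_min * rng.standard_normal(z_mean.shape)
    return z

# Example: encode a small batch of flattened "images".
x = rng.standard_normal((4, 16))
W = rng.standard_normal((16, 8)) / np.sqrt(16)
z = encode(x, W)
print(z.shape)  # (4, 8)
```

Because `SIGMA_MIN` is small, the latent stays close to the deterministic encoder output while still being a valid sample at the prior's lowest noise level, which is what makes the bitrate bound in the paper's objective tight.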
arXiv:2602.17270 (cs.LG) — Submitted on 19 Feb 2026

Title: Unified Latents (UL): How to train your latents

Authors: Jonathan Heek, Emiel Hoogeboom, Thomas Mensink, Tim Salimans

Abstract: We present Unified Latents (UL), a framework for learning latent representations that are jointly regularized by a diffusion prior and decoded by a diffusion model. By linking the encoder's output noise to the prior's minimum noise level, we obtain a simple training objective that provides a tight upper bound on the latent bitrate. On ImageNet-512, our approach achieves competitive FID of 1.4, with high reconstruction quality (PSNR) while requiring fewer training FLOPs than models trained on Stable Diffusion latents. On Kinetics-600, we set a new state-of-the-art FVD of 1.3.

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)

DOI: https://doi.org/10.48550/arXiv.2602.17270

Submission history: [v1] Thu, 19 Feb 2026 11:18:12 UTC (8,477 KB), from Jonathan Heek