[2603.22375] Three Creates All: You Only Sample 3 Steps
arXiv:2603.22375 (cs)
[Submitted on 23 Mar 2026]

Title: Three Creates All: You Only Sample 3 Steps

Authors: Yuren Cai, Guangyi Wang, Zongqing Li, Li Li, Zhihui Liu, Songzhi Su

Abstract: Diffusion models deliver high-fidelity generation but remain slow at inference time because sampling requires many sequential network evaluations. We find that standard timestep conditioning becomes a key bottleneck for few-step sampling. Motivated by layer-dependent denoising dynamics, we propose Multi-layer Time Embedding Optimization (MTEO), which freezes the pretrained diffusion backbone and distills a small set of step-wise, layer-wise time embeddings from reference trajectories. MTEO is plug-and-play with existing ODE solvers, adds no inference-time overhead, and trains only a tiny fraction of the parameters. Extensive experiments across diverse datasets and backbones show state-of-the-art performance in few-step sampling and substantially narrow the gap between distillation-based and lightweight methods. Code will be available.

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Cite as: arXiv:2603.22375 [cs.LG] (or arXiv:2603.22375v1 [cs.LG] for this version)

https://doi.org/10.48550/arXiv.2603.22375 (arXiv-issued DOI via DataCite)
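
The abstract describes freezing the backbone and learning one time embedding per sampling step and per layer so that a 3-step sampler tracks a reference trajectory. The following is a toy sketch of that idea only, not the authors' implementation, since their code is not shown here. Everything in it is an assumption made for illustration: the two-layer scalar `backbone`, the scalar embeddings, the Euler solver, the many-step reference, and the finite-difference optimizer.

```python
import math

NUM_LAYERS = 2  # depth of the toy backbone (assumption)
NUM_STEPS = 3   # few-step sampling budget


def backbone(x, emb):
    """Frozen toy 'network'; each layer reads its own scalar time embedding."""
    h = math.tanh(0.8 * x + emb[0])            # layer 1, conditioned on emb[0]
    return -1.5 * h + 0.5 * math.sin(emb[1])   # layer 2, conditioned on emb[1]


def time_grid(n):
    """Uniform grid from t = 1 down to t = 0 with n steps."""
    return [1.0 - k / n for k in range(n + 1)]


def reference_states(x0, fine_steps=300):
    """Many-step Euler rollout with standard conditioning (every layer sees t),
    recorded at the coarse grid times used by the few-step sampler."""
    stride = fine_steps // NUM_STEPS
    ts = time_grid(fine_steps)
    x, out = x0, []
    for k in range(fine_steps):
        x = x + (ts[k + 1] - ts[k]) * backbone(x, [ts[k]] * NUM_LAYERS)
        if (k + 1) % stride == 0:
            out.append(x)
    return out


def student_states(x0, emb):
    """3-step Euler rollout; step k uses its own per-layer embeddings emb[k]."""
    ts = time_grid(NUM_STEPS)
    x, out = x0, []
    for k in range(NUM_STEPS):
        x = x + (ts[k + 1] - ts[k]) * backbone(x, emb[k])
        out.append(x)
    return out


def loss(emb, starts, refs):
    """Mean squared distance between student and reference states."""
    total = 0.0
    for x0, ref in zip(starts, refs):
        total += sum((s - r) ** 2 for s, r in zip(student_states(x0, emb), ref))
    return total / len(starts)


def optimize(emb, starts, refs, iters=200, lr=0.5, eps=1e-5):
    """Gradient descent on the embeddings only; the backbone is never changed.
    Gradients are central finite differences; a step is kept only if it helps."""
    emb = [row[:] for row in emb]
    best = loss(emb, starts, refs)
    for _ in range(iters):
        grad = [[0.0] * NUM_LAYERS for _ in range(NUM_STEPS)]
        for k in range(NUM_STEPS):
            for l in range(NUM_LAYERS):
                orig = emb[k][l]
                emb[k][l] = orig + eps
                up = loss(emb, starts, refs)
                emb[k][l] = orig - eps
                down = loss(emb, starts, refs)
                emb[k][l] = orig
                grad[k][l] = (up - down) / (2 * eps)
        trial = [[emb[k][l] - lr * grad[k][l] for l in range(NUM_LAYERS)]
                 for k in range(NUM_STEPS)]
        trial_loss = loss(trial, starts, refs)
        if trial_loss < best:
            emb, best = trial, trial_loss
        else:
            lr *= 0.5  # backtrack when the step overshoots
    return emb, best


starts = [-1.0, -0.5, 0.0, 0.5, 1.0]
refs = [reference_states(x0) for x0 in starts]
# Standard conditioning as the starting point: every layer sees the step's t.
init_emb = [[t] * NUM_LAYERS for t in time_grid(NUM_STEPS)[:-1]]
loss_before = loss(init_emb, starts, refs)
learned_emb, loss_after = optimize(init_emb, starts, refs)
print("loss with standard conditioning:", loss_before)
print("loss with learned embeddings:   ", loss_after)
```

In this sketch the learned table holds only `NUM_STEPS * NUM_LAYERS` numbers, and the 3-step sampler makes the same number of backbone calls before and after optimization, which mirrors the abstract's claims of a tiny trainable footprint and no added inference cost. A real diffusion backbone would use vector embeddings and backpropagation rather than finite differences.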