[2310.17167] Improving Denoising Diffusion Models via Simultaneous Estimation of Image and Noise
Summary
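This paper proposes two changes to denoising diffusion models: an angular reparameterization of the diffusion process, and a network that estimates the image and the noise simultaneously. Together they aim to improve both the speed and the quality of image generation.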
Why It Matters
As generative models become increasingly integral to AI applications, the efficiency and quality of image generation matter more. This research targets two limitations of current diffusion models, singularities in the conventional parameterization and unstable update steps in the inverse process, which could lead to faster and more accurate image synthesis in computer vision and other machine learning applications.
Key Takeaways
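- Reparameterizes the diffusion process by the angle $\eta$ on a quarter-circular arc between image and noise, setting $\sqrt{\bar{\alpha}}=\cos(\eta)$; this removes two singularities and yields a well-behaved ODE that higher-order solvers such as Runge-Kutta methods can handle.
- Estimates both the image ($\mathbf{x}_0$) and the noise ($\mathbf{\epsilon}$) with the network, which makes the update step of the inverse diffusion process more stable.
- Achieves faster image generation and higher-quality outputs, as measured by FID.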
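The reparameterization and the simultaneous estimate listed above can be illustrated with a small NumPy sketch. It assumes the standard variance-preserving forward process $\mathbf{x}_t = \sqrt{\bar{\alpha}}\,\mathbf{x}_0 + \sqrt{1-\bar{\alpha}}\,\mathbf{\epsilon}$, so that $\sqrt{\bar{\alpha}}=\cos(\eta)$ implies $\sqrt{1-\bar{\alpha}}=\sin(\eta)$ and $d\mathbf{x}/d\eta = -\sin(\eta)\,\mathbf{x}_0 + \cos(\eta)\,\mathbf{\epsilon}$. The function names (`forward_sample`, `velocity`, `rk4_step`, `sample`) and the `oracle` estimator are hypothetical stand-ins, not the authors' code; in particular, `oracle` returns the true image and noise in place of a trained network, and the paper's actual update rule may differ.

```python
import numpy as np

def forward_sample(x0, eps, eta):
    """Noisy sample at angle eta: cos(eta) * x0 + sin(eta) * eps."""
    return np.cos(eta) * x0 + np.sin(eta) * eps

def velocity(x0_hat, eps_hat, eta):
    """dx/d(eta) along the arc, built from estimates of BOTH x0 and eps.

    No division by cos(eta) or sin(eta) is needed, so nothing blows up
    at eta = 0 (pure image) or eta = pi/2 (pure noise).
    """
    return -np.sin(eta) * x0_hat + np.cos(eta) * eps_hat

def rk4_step(x, eta, h, estimator):
    """One classical fourth-order Runge-Kutta step of size h (h < 0 denoises)."""
    def f(x_, eta_):
        x0_hat, eps_hat = estimator(x_, eta_)
        return velocity(x0_hat, eps_hat, eta_)
    k1 = f(x, eta)
    k2 = f(x + 0.5 * h * k1, eta + 0.5 * h)
    k3 = f(x + 0.5 * h * k2, eta + 0.5 * h)
    k4 = f(x + h * k3, eta + h)
    return x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def sample(x_noise, estimator, n_steps=8):
    """Integrate from eta = pi/2 (noise) down to eta = 0 (image)."""
    etas = np.linspace(np.pi / 2, 0.0, n_steps + 1)
    x = x_noise
    for eta_a, eta_b in zip(etas[:-1], etas[1:]):
        x = rk4_step(x, eta_a, eta_b - eta_a, estimator)
    return x

# Toy check with a hypothetical oracle in place of the trained network.
rng = np.random.default_rng(0)
x0 = rng.normal(size=(4, 4))   # stand-in "image"
eps = rng.normal(size=(4, 4))  # stand-in noise

def oracle(x, eta):
    # Assumption: a perfect estimator that returns the true (x0, eps).
    return x0, eps

x_start = forward_sample(x0, eps, np.pi / 2)  # essentially pure noise
x_rec = sample(x_start, oracle, n_steps=8)
print("max abs reconstruction error:", np.max(np.abs(x_rec - x0)))
```

With a perfect estimator the trajectory is an exact quarter circle, so a handful of Runge-Kutta steps recovers the image closely; with a learned estimator the achievable step count depends on how accurate the two predictions are at each angle.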
Computer Science > Machine Learning
arXiv:2310.17167 (cs)
[Submitted on 26 Oct 2023 (v1), last revised 25 Feb 2026 (this version, v2)]
Title: Improving Denoising Diffusion Models via Simultaneous Estimation of Image and Noise
Authors: Zhenkai Zhang, Krista A. Ehinger, Tom Drummond
Abstract: This paper introduces two key contributions aimed at improving the speed and quality of images generated through inverse diffusion processes. The first contribution involves reparameterizing the diffusion process in terms of the angle on a quarter-circular arc between the image and noise, specifically setting the conventional $\displaystyle \sqrt{\bar{\alpha}}=\cos(\eta)$. This reparameterization eliminates two singularities and allows for the expression of diffusion evolution as a well-behaved ordinary differential equation (ODE). In turn, this allows higher order ODE solvers such as Runge-Kutta methods to be used effectively. The second contribution is to directly estimate both the image ($\mathbf{x}_0$) and noise ($\mathbf{\epsilon}$) using our network, which enables more stable calculations of the update step in the inverse diffusion steps, as accurate estimation of both the image and noise are crucial at different stages of the process. Together with these changes, our model achieves faster generation, wit...