[2602.13303] Spectral Collapse in Diffusion Inversion
Summary
The paper identifies 'spectral collapse' in diffusion inversion: standard deterministic inversion (e.g., DDIM) fails for image translation when the source domain is spectrally sparse compared to the target domain. It proposes an inference-time method, Orthogonal Variance Guidance (OVG), to restore texture fidelity while preserving structure.
Why It Matters
Unpaired image-to-image translation by diffusion inversion breaks down when the source domain is spectrally sparse relative to the target, as in super-resolution or sketch-to-image. Stochastic alternatives that try to restore the noise variance tend to break the semantic link to the input and cause structural drift. OVG targets this structure-texture trade-off at inference time, which is relevant to applications such as microscopy super-resolution.
Key Takeaways
- Standard deterministic inversion methods fail under spectral sparsity.
- Spectral collapse leads to oversmoothed and texture-poor image generations.
- Orthogonal Variance Guidance (OVG) effectively restores texture while maintaining structure.
- Extensive experiments evaluate OVG on microscopy super-resolution (BBBC021) and sketch-to-image (Edges2Shoes).
- OVG works at inference time: it corrects the ODE dynamics to enforce the theoretical Gaussian noise magnitude within the null-space of the structural gradient.
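The first two takeaways can be illustrated with a rough diagnostic. This is a minimal sketch, not code from the paper: it compares the radially averaged power spectrum of a white Gaussian array with a low-pass filtered one standing in for a "collapsed" latent. The function names (`radial_power_spectrum`, `high_freq_fraction`, `low_pass`), the 64x64 size, and the 0.1 cutoff are all assumptions chosen for illustration.

```python
import numpy as np

def radial_power_spectrum(x, n_bins=16):
    """Radially averaged power spectrum of a 2-D array (frequencies up to 0.5)."""
    h, w = x.shape
    power = np.abs(np.fft.fftshift(np.fft.fft2(x))) ** 2 / (h * w)
    fy = np.fft.fftshift(np.fft.fftfreq(h))
    fx = np.fft.fftshift(np.fft.fftfreq(w))
    radius = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2).ravel()
    edges = np.linspace(0.0, 0.5, n_bins + 1)
    idx = np.digitize(radius, edges) - 1
    valid = idx < n_bins  # drop corner frequencies at or beyond 0.5
    sums = np.bincount(idx[valid], weights=power.ravel()[valid], minlength=n_bins)
    counts = np.bincount(idx[valid], minlength=n_bins)
    return sums / np.maximum(counts, 1)

def high_freq_fraction(x, n_bins=16):
    """Share of the binned spectrum that sits in the upper half of the bins."""
    spec = radial_power_spectrum(x, n_bins)
    return spec[n_bins // 2:].sum() / spec.sum()

def low_pass(x, cutoff=0.1):
    """Hypothetical stand-in for a collapsed latent: keep only low frequencies."""
    fy = np.fft.fftfreq(x.shape[0])[:, None]
    fx = np.fft.fftfreq(x.shape[1])[None, :]
    mask = np.sqrt(fy ** 2 + fx ** 2) <= cutoff
    y = np.fft.ifft2(np.fft.fft2(x) * mask).real
    return y / y.std()  # rescale to unit variance, as a Gaussian latent would have

rng = np.random.default_rng(0)
white = rng.standard_normal((64, 64))   # isotropic Gaussian reference
smooth = low_pass(white)                # spectrally sparse surrogate
print("white :", high_freq_fraction(white))
print("smooth:", high_freq_fraction(smooth))
```

White noise has a roughly flat spectrum, so about half of the binned power falls in the upper bins, while the filtered array keeps almost none there even though its overall variance matches.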
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.13303 (cs)
[Submitted on 9 Feb 2026]
Title: Spectral Collapse in Diffusion Inversion
Authors: Nicolas Bourriez, Alexandre Verine, Auguste Genovesio
Abstract: Conditional diffusion inversion provides a powerful framework for unpaired image-to-image translation. However, we demonstrate through an extensive analysis that standard deterministic inversion (e.g. DDIM) fails when the source domain is spectrally sparse compared to the target domain (e.g., super-resolution, sketch-to-image). In these contexts, the recovered latent from the input does not follow the expected isotropic Gaussian distribution. Instead it exhibits a signal with lower frequencies, locking target sampling to oversmoothed and texture-poor generations. We term this phenomenon spectral collapse. We observe that stochastic alternatives attempting to restore the noise variance tend to break the semantic link to the input, leading to structural drift. To resolve this structure-texture trade-off, we propose Orthogonal Variance Guidance (OVG), an inference-time method that corrects the ODE dynamics to enforce the theoretical Gaussian noise magnitude within the null-space of the structural gradient. Extensive experiments on microscopy super-resolution (BBBC021) and sketch-to-image (Edges2Shoes) demonstrate th...
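The abstract names two ingredients for OVG: a target Gaussian noise magnitude and the null-space of a structural gradient. The sketch below illustrates only the linear algebra behind those two ideas and is not the authors' update rule, which the abstract does not spell out. It assumes that the "theoretical magnitude" means the squared norm d expected of a d-dimensional standard Gaussian, and it uses a random array as a placeholder for the structural gradient. The names `project_out` and `restore_norm_orthogonally` are hypothetical.

```python
import numpy as np

def project_out(v, g, eps=1e-12):
    """Remove from v its component along g (projection onto g's orthogonal complement)."""
    coeff = np.vdot(g, v) / (np.vdot(g, g) + eps)
    return v - coeff * g

def restore_norm_orthogonally(z, g, rng):
    """Raise ||z||^2 to d by adding noise that is orthogonal to g.

    Assumption: for eps ~ N(0, I_d), E||eps||^2 = d, so d is used as the target.
    """
    deficit = float(z.size) - float(np.sum(z ** 2))
    if deficit <= 0:
        return z.copy()  # already at or above the target magnitude
    n = project_out(rng.standard_normal(z.shape), g)  # noise restricted to g's null-space
    nn = float(np.sum(n ** 2))
    b = float(np.vdot(z, n))
    # Positive root of a^2 * nn + 2 * a * b - deficit = 0, so ||z + a n||^2 = d
    a = (-b + np.sqrt(b ** 2 + nn * deficit)) / nn
    return z + a * n

rng = np.random.default_rng(1)
z = 0.3 * rng.standard_normal((8, 8))  # low-variance latent (illustrative only)
g = rng.standard_normal((8, 8))        # placeholder for a structural gradient
z_new = restore_norm_orthogonally(z, g, rng)
print("squared norm before/after:", np.sum(z ** 2), np.sum(z_new ** 2))
print("component along g before/after:", np.vdot(z, g), np.vdot(z_new, g))
```

Because the added noise has no component along `g`, the inner product of the latent with `g` is unchanged while its squared norm moves to d. In a real sampler, `g` would come from a structural objective rather than from random draws.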