[2602.19461] Laplacian Multi-scale Flow Matching for Generative Modeling
Summary
The paper presents Laplacian Multi-scale Flow Matching (LapFlow), a new framework for image generative modeling that enhances flow matching by using multi-scale representations, improving quality and efficiency in generating high-resolution images.
Why It Matters
Earlier cascaded multi-scale approaches require explicit renoising between scales. LapFlow generates all scales in parallel, removing that bridging step, and the authors report better sample quality with fewer GFLOPs and faster inference than single-scale and multi-scale flow matching baselines.
Key Takeaways
- LapFlow decomposes images into Laplacian pyramid residuals and applies flow matching to these multi-scale representations.
- A mixture-of-transformers (MoT) architecture with causal attention processes the scales in parallel, with no renoising between scales.
- It reports superior sample quality with fewer GFLOPs and faster inference than single-scale and multi-scale flow matching baselines.
- The model scales to high-resolution generation, up to 1024x1024.
- Experiments on CelebA-HQ and ImageNet support these results.
Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.19461 (cs) [Submitted on 23 Feb 2026]
Title: Laplacian Multi-scale Flow Matching for Generative Modeling
Authors: Zelin Zhao, Petr Molodyk, Haotian Xue, Yongxin Chen
Abstract: In this paper, we present Laplacian multiscale flow matching (LapFlow), a novel framework that enhances flow matching by leveraging multi-scale representations for image generative modeling. Our approach decomposes images into Laplacian pyramid residuals and processes different scales in parallel through a mixture-of-transformers (MoT) architecture with causal attention mechanisms. Unlike previous cascaded approaches that require explicit renoising between scales, our model generates multi-scale representations in parallel, eliminating the need for bridging processes. The proposed multi-scale architecture not only improves generation quality but also accelerates the sampling process and promotes scaling flow matching methods. Through extensive experimentation on CelebA-HQ and ImageNet, we demonstrate that our method achieves superior sample quality with fewer GFLOPs and faster inference compared to single-scale and multi-scale flow matching baselines. The proposed model scales effectively to high-resolution generation (up to 1024$\times$1024) while maintaining lower computational ov...
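The abstract's first step, decomposing an image into Laplacian pyramid residuals, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the 2x2 average-pooling downsampler, the nearest-neighbour upsampler, the single-channel input, and all function names are assumptions chosen for illustration. The sketch only shows that a pyramid stores one residual per scale plus a coarsest image, and that the original image can be recovered from them.

```python
import numpy as np


def downsample(x):
    """Halve height and width with 2x2 average pooling (assumed filter)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample(x):
    """Double height and width by nearest-neighbour repetition (assumed filter)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)


def laplacian_pyramid(image, num_levels):
    """Return [finest residual, ..., coarsest image] for a 2D array.

    Height and width must be divisible by 2 ** (num_levels - 1).
    """
    levels = []
    current = image
    for _ in range(num_levels - 1):
        coarse = downsample(current)
        # The residual holds the detail lost by going one scale coarser.
        levels.append(current - upsample(coarse))
        current = coarse
    levels.append(current)
    return levels


def reconstruct(levels):
    """Invert laplacian_pyramid by upsampling and adding residuals back."""
    current = levels[-1]
    for residual in reversed(levels[:-1]):
        current = upsample(current) + residual
    return current


# Hypothetical usage on a random 64x64 single-channel "image".
rng = np.random.default_rng(0)
image = rng.standard_normal((64, 64))
pyramid = laplacian_pyramid(image, num_levels=3)
restored = reconstruct(pyramid)
print([level.shape for level in pyramid])
print(np.allclose(restored, image))
```

Because each residual is defined as the difference between a level and the upsampled next-coarser level, reconstruction is exact up to floating-point error for any choice of downsampling and upsampling filters; the filters only affect how the signal is distributed across scales.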