[2505.13213] Diffusion Models with Double Guidance: Generate with aggregated datasets
Statistics > Machine Learning
arXiv:2505.13213 (stat)
[Submitted on 19 May 2025 (v1), last revised 29 Mar 2026 (this version, v2)]

Title: Diffusion Models with Double Guidance: Generate with aggregated datasets
Authors: Yanfeng Yang, Kenji Fukumizu

Abstract: Creating large-scale datasets for training high-performance generative models is often prohibitively expensive, especially when associated attributes or annotations must be provided. As a result, merging existing datasets has become a common strategy. However, the sets of attributes across datasets are often inconsistent, and their naive concatenation typically leads to block-wise missing conditions. This presents a significant challenge for conditional generative modeling when the multiple attributes are used jointly as conditions, thereby limiting the model's controllability and applicability. To address this issue, we propose a novel generative approach, Diffusion Model with Double Guidance, which enables precise conditional generation even when no training samples contain all conditions simultaneously. Our method maintains rigorous control over multiple conditions without requiring joint annotations. We demonstrate its effectiveness in molecular and image generation tasks, where it outperforms existing baselines both in alignment with target conditiona...
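
To make the "block-wise missing conditions" mentioned in the abstract concrete, the following is a minimal sketch of what naive concatenation of two datasets with different attribute sets can look like. The datasets, the attribute names `c1` and `c2`, and the helper `naive_concat` are hypothetical illustrations and are not taken from the paper; the sketch shows only the data problem, not the proposed method.

```python
# Hypothetical toy datasets: each one annotates a different attribute.
dataset_a = [{"x": "sample_a1", "c1": 0.2}, {"x": "sample_a2", "c1": 0.7}]
dataset_b = [{"x": "sample_b1", "c2": 1.5}, {"x": "sample_b2", "c2": 3.1}]

# Union of the attribute sets (assumed names, for illustration only).
ATTRIBUTES = ("c1", "c2")


def naive_concat(*datasets):
    """Stack datasets row-wise; attributes a dataset lacks become None."""
    merged = []
    for dataset in datasets:
        for row in dataset:
            conditions = {name: row.get(name) for name in ATTRIBUTES}
            merged.append({"x": row["x"], **conditions})
    return merged


merged = naive_concat(dataset_a, dataset_b)

# Rows from dataset_a miss c2 and rows from dataset_b miss c1, so the
# missing entries form blocks rather than being scattered at random.
fully_annotated = [
    row for row in merged
    if all(row[name] is not None for name in ATTRIBUTES)
]

for row in merged:
    print(row)
print("rows with all conditions:", len(fully_annotated))
```

In this toy case no merged row carries both `c1` and `c2`, which is the setting the abstract describes: no training sample contains all conditions simultaneously.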