[2404.00962] Distributional Priors Guided Diffusion for Generating 3D Molecules in Low Data Regimes
Computer Science > Machine Learning
arXiv:2404.00962 (cs)
[Submitted on 1 Apr 2024 (v1), last revised 2 Mar 2026 (this version, v2)]

Title: Distributional Priors Guided Diffusion for Generating 3D Molecules in Low Data Regimes
Authors: Haokai Hong, Wanyu Lin, Ming Yang, Kay Chen Tan

Abstract: Can we train a 3D molecule generator using data from dense regions to generate samples in sparse regions? This challenge can be framed as an out-of-distribution (OOD) generation problem. While prior research on OOD generation predominantly targets property shifts, structural shifts -- such as differences in molecular scaffolds or functional groups -- represent an equally critical source of distributional shifts. This work introduces the Geometric OOD Diffusion Model (GODD), a novel diffusion-based framework that enables training on data-abundant molecular distributions while generalizing to data-scarce distributions under distributional structural shifts. Central to our approach is a designated equivariant asymmetric autoencoder to capture distributional structural priors. The asymmetric design allows the model to generalize to unseen structural variations by capturing distributional priors representing distinct distributions. The encoded structural-grained priors guide generation toward sparse regions without requiring explici...