[2602.20293] Discrete Diffusion with Sample-Efficient Estimators for Conditionals
Summary
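This paper presents a discrete denoising diffusion framework for generative modeling over discrete state spaces. It combines round-robin noising and denoising dynamics with a sample-efficient estimator of single-site conditionals, and it uses those conditionals, rather than a discrete analog of a score function, to parameterize the reverse diffusion process.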
Why It Matters
The work targets sample efficiency in generative modeling over discrete state spaces, a setting that covers scientific data such as Ising and Potts models, quantum annealer output, and one-dimensional quantum systems. Estimating single-site conditionals more efficiently could yield more accurate generative models in practice.
Key Takeaways
- Introduces a discrete diffusion framework for generative modeling.
- Utilizes the Neural Interaction Screening Estimator (NeurISE) for efficient conditional estimation.
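- Outperforms popular existing methods, including ratio-based approaches, on binary datasets in total variation, cross-correlation, and kernel density estimation metrics.
- Validates the approach in controlled experiments on synthetic Ising and Potts models, MNIST, D-Wave quantum annealer data, and one-dimensional quantum systems.
- Points to applications of machine learning to data generated by quantum systems.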
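The conditional-estimation takeaway can be made concrete with a small sketch. It assumes the binary (±1) form of the interaction screening loss, the mean of exp(-sigma_u * f(rest)), which this summary does not spell out; the paper's own parameterization may differ. A constant `f` stands in for the neural network, and the names `screening_loss` and `conditional_plus` are hypothetical. Under that assumption the minimizer satisfies f = 0.5 * log(p_plus / p_minus), so the conditional is recovered as sigmoid(2f).

```python
import numpy as np

def screening_loss(f_values, sigma_u):
    """Empirical interaction screening loss: mean of exp(-sigma_u * f(rest)).

    Assumed binary (+1/-1) form; f_values holds f evaluated on each sample.
    """
    return float(np.mean(np.exp(-sigma_u * f_values)))

def conditional_plus(f_values):
    """P(sigma_u = +1 | rest) implied by the loss minimizer: sigmoid(2 f)."""
    return 1.0 / (1.0 + np.exp(-2.0 * f_values))

# Toy check with no neighbours, so f reduces to a single constant.
rng = np.random.default_rng(0)
sigma_u = np.where(rng.random(10_000) < 0.8, 1.0, -1.0)

# Minimize the loss over a grid of constant values (a stand-in for training).
grid = np.linspace(-2.0, 2.0, 401)
losses = [screening_loss(np.full_like(sigma_u, f), sigma_u) for f in grid]
f_hat = grid[int(np.argmin(losses))]

# The recovered conditional should match the empirical frequency of +1.
p_hat = conditional_plus(f_hat)
print(p_hat, np.mean(sigma_u == 1.0))
```

In a real setting `f` would depend on the remaining sites and be fit by gradient descent; the grid search here only illustrates where the loss minimum sits.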
arXiv:2602.20293 (cs) [Submitted on 23 Feb 2026]
Title: Discrete Diffusion with Sample-Efficient Estimators for Conditionals
Authors: Karthik Elamvazhuthi, Abhijith Jayakumar, Andrey Y. Lokhov
Abstract: We study a discrete denoising diffusion framework that integrates a sample-efficient estimator of single-site conditionals with round-robin noising and denoising dynamics for generative modeling over discrete state spaces. Rather than approximating a discrete analog of a score function, our formulation treats single-site conditional probabilities as the fundamental objects that parameterize the reverse diffusion process. We employ a sample-efficient method known as Neural Interaction Screening Estimator (NeurISE) to estimate these conditionals in the diffusion dynamics. Controlled experiments on synthetic Ising models, MNIST, and scientific data sets produced by a D-Wave quantum annealer, synthetic Potts model and one-dimensional quantum systems demonstrate the proposed approach. On the binary data sets, these experiments demonstrate that the proposed approach outperforms popular existing methods including ratio-based approaches, achieving improved performance in total variation, cross-correlations, and kernel density estimation metrics.
Subjects: Machine Learning (cs.LG...
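To illustrate why single-site conditionals are enough to drive a reverse process, here is a minimal sketch under stated assumptions, not the authors' algorithm. It assumes a forward process that visits sites in a fixed order and replaces each one with a uniform random bit, and it uses exact conditionals computed from a small probability table as a hypothetical stand-in for a learned estimator such as NeurISE. All function names are illustrative.

```python
import numpy as np

def forward_marginals(p0):
    """Return [p_0, ..., p_n], where step k replaces site k with a uniform bit."""
    marginals = [p0]
    for k in range(p0.ndim):
        prev = marginals[-1]
        # Averaging over site k and broadcasting back makes site k uniform.
        nxt = np.broadcast_to(prev.mean(axis=k, keepdims=True), prev.shape).copy()
        marginals.append(nxt)
    return marginals

def single_site_conditional(p, x, k):
    """P(x_k = 1 | other sites) under joint table p (stand-in for a learned model)."""
    x1 = list(x); x1[k] = 1
    x0 = list(x); x0[k] = 0
    a, b = p[tuple(x1)], p[tuple(x0)]
    return a / (a + b)

def reverse_sample(marginals, rng):
    """Start from uniform noise and denoise sites in reverse visiting order."""
    n = marginals[0].ndim
    x = rng.integers(0, 2, size=n)
    for k in reversed(range(n)):
        # Site k is resampled from its conditional under p_k, the
        # distribution just before site k was noised in the forward pass.
        prob_one = single_site_conditional(marginals[k], x, k)
        x[k] = int(rng.random() < prob_one)
    return tuple(int(v) for v in x)

rng = np.random.default_rng(0)
p0 = rng.random((2, 2, 2)) + 0.1   # strictly positive toy target on 3 bits
p0 /= p0.sum()
marginals = forward_marginals(p0)

samples = [reverse_sample(marginals, rng) for _ in range(20_000)]
empirical = np.zeros_like(p0)
for s in samples:
    empirical[s] += 1.0 / len(samples)

# Total variation distance between generated samples and the target.
tv = 0.5 * np.abs(empirical - p0).sum()
print(tv)
```

With exact conditionals this reverse sweep reproduces the target distribution, so any remaining total variation in the toy run comes from finite sampling. In practice the conditionals would be estimated from data, and the quality of that estimate is what the reported metrics measure.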