[2602.23968] Learning Generation Orders for Masked Discrete Diffusion Models via Variational Inference
Computer Science > Machine Learning
arXiv:2602.23968 (cs)
[Submitted on 27 Feb 2026]

Title: Learning Generation Orders for Masked Discrete Diffusion Models via Variational Inference

Authors: David Fox, Sam Bowyer, Song Liu, Laurence Aitchison, Raul Santos-Rodriguez, Mengyue Yang

Abstract: Masked discrete diffusion models (MDMs) are a promising new approach to generative modelling, offering the ability for parallel token generation and therefore greater efficiency than autoregressive counterparts. However, achieving an optimal balance between parallel generation and sample quality remains an open problem. Current approaches primarily address this issue through fixed, heuristic parallel sampling methods. There exist some recent learning based approaches to this problem, but its formulation from the perspective of variational inference remains underexplored. In this work, we propose a variational inference framework for learning parallel generation orders for MDMs. As part of our method, we propose a parameterisation for the approximate posterior of generation orders which facilitates parallelism and efficient sampling during training. Using this method, we conduct preliminary experiments on the GSM8K dataset, where our method performs competitively against heuristic sampling strategies in the regime of highly paralle...
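
For orientation, the following is a minimal sketch of the kind of fixed, heuristic parallel sampling strategy the abstract contrasts against. It is not the method proposed in the paper. Everything in it is an assumption made for illustration: the `MASK` sentinel, the `toy_denoiser` stand-in (random probabilities in place of a trained MDM), and the rule of unmasking the `k` most confident positions per step.

```python
import numpy as np

MASK = -1  # hypothetical sentinel for a masked position


def toy_denoiser(tokens, vocab_size, rng):
    """Stand-in for a trained MDM (assumption): random per-position
    probabilities over the vocabulary, ignoring the actual context."""
    logits = rng.normal(size=(len(tokens), vocab_size))
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    return probs / probs.sum(axis=1, keepdims=True)


def confidence_unmask(seq_len, vocab_size, k, seed=0):
    """Unmask up to k positions per step, choosing the masked positions
    whose top predicted probability is highest (a fixed heuristic)."""
    rng = np.random.default_rng(seed)
    tokens = np.full(seq_len, MASK)
    order = []  # positions revealed at each step: the generation order
    while (tokens == MASK).any():
        probs = toy_denoiser(tokens, vocab_size, rng)
        masked = np.flatnonzero(tokens == MASK)
        confidence = probs[masked].max(axis=1)
        chosen = masked[np.argsort(-confidence)[:k]]
        tokens[chosen] = probs[chosen].argmax(axis=1)
        order.append(sorted(chosen.tolist()))
    return tokens, order


tokens, order = confidence_unmask(seq_len=10, vocab_size=5, k=4)
print(len(order), "steps; positions revealed per step:", order)
```

Here the number of positions revealed per step is fixed in advance by `k`, which is the trade-off between parallelism and sample quality that the abstract describes; a learned approach would instead choose which positions to reveal together.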