[2603.00045] Breaking the Factorization Barrier in Diffusion Language Models
Computer Science > Machine Learning
arXiv:2603.00045 (cs)
[Submitted on 9 Feb 2026]

Title: Breaking the Factorization Barrier in Diffusion Language Models
Authors: Ian Li, Zilei Shao, Benjie Wang, Rose Yu, Guy Van den Broeck, Anji Liu

Abstract: Diffusion language models theoretically allow for efficient parallel generation but are practically hindered by the "factorization barrier": the assumption that simultaneously predicted tokens are independent. This limitation forces a trade-off: models must either sacrifice speed by resolving dependencies sequentially or suffer from incoherence due to factorization. We argue that this barrier arises not from limited backbone expressivity, but from a structural misspecification: models are restricted to fully factorized outputs because explicitly parameterizing a joint distribution would require the Transformer to output a prohibitively large number of parameters. We propose Coupled Discrete Diffusion (CoDD), a hybrid framework that breaks this barrier by replacing the fully-factorized output distribution with a lightweight, tractable probabilistic inference layer. This formulation yields a distribution family that is significantly more expressive than standard factorized priors, enabling the modeling of complex joint dependencies, yet remains compact enough to avoid the prohibitive p...
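The factorization barrier the abstract describes can be made concrete with a toy example (ours, not from the paper): when two token positions are perfectly correlated under the true joint distribution, a fully factorized model can only represent the per-position marginals, and their product necessarily leaks probability mass onto incoherent token pairs.

```python
from itertools import product

# Toy vocabulary of two tokens and a joint distribution over two
# positions that allows only the coherent pairs (A, A) and (B, B).
V = ["A", "B"]
joint = {("A", "A"): 0.5, ("B", "B"): 0.5}

# A fully factorized output head can only represent per-position
# marginals, which here are both uniform over the vocabulary.
marg0 = {t: sum(p for (a, _), p in joint.items() if a == t) for t in V}
marg1 = {t: sum(p for (_, b), p in joint.items() if b == t) for t in V}

# Sampling positions independently uses the product of the marginals,
# which assigns mass to pairs the true joint forbids.
factorized = {(a, b): marg0[a] * marg1[b] for a, b in product(V, V)}
incoherent_mass = factorized[("A", "B")] + factorized[("B", "A")]
print(incoherent_mass)  # 0.5: half the mass lands on incoherent pairs

# Storing the joint over k positions explicitly would need |V|**k
# entries, which is why the abstract calls that route prohibitive.
```

This is the trade-off the abstract names: resolving the correlation sequentially restores coherence but costs a second decoding step, while parallel factorized prediction keeps the speed but samples incoherent pairs half the time in this example.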