[2603.22248] Confidence-Based Decoding is Provably Efficient for Diffusion Language Models
Computer Science > Machine Learning
arXiv:2603.22248 (cs)
[Submitted on 23 Mar 2026]

Title: Confidence-Based Decoding is Provably Efficient for Diffusion Language Models
Authors: Changxiao Cai, Gen Li

Abstract: Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive (AR) models for language modeling, allowing flexible generation order and parallel generation of multiple tokens. However, this flexibility introduces a challenge absent in AR models: the \emph{decoding strategy} -- which determines the order and number of tokens generated at each iteration -- critically affects sampling efficiency. Among decoding strategies explored in practice, confidence-based methods, which adaptively select which and how many tokens to unmask based on prediction confidence, have shown strong empirical performance. Despite this success, our theoretical understanding of confidence-based decoding remains limited. In this work, we develop the first theoretical analysis framework for confidence-based decoding in DLMs. We focus on an entropy sum-based strategy that continues unmasking tokens within each iteration until the cumulative entropy exceeds a threshold, and show that it achieves $\varepsilon$-accurate sampling in KL divergence with an expected number of iterations $\widetilde O(H(X_...
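The entropy sum-based strategy described in the abstract can be sketched as follows. This is a minimal illustrative reading, not the paper's actual algorithm: it assumes per-position predictive distributions are available, ranks masked positions by confidence (lowest entropy first), and unmasks positions until adding the next one would push the cumulative entropy past the threshold. The function names and the tie-breaking/progress rule (always unmask at least one position) are assumptions for illustration.

```python
import math

def entropy(p):
    # Shannon entropy (in nats) of a probability distribution
    return -sum(q * math.log(q) for q in p if q > 0)

def select_positions(pred_dists, masked, threshold):
    """Illustrative sketch of entropy sum-based selection.

    pred_dists: list of predictive distributions, one per sequence position
    masked: indices of still-masked positions
    threshold: cumulative-entropy budget for this iteration

    Ranks masked positions from most to least confident (lowest entropy
    first) and greedily unmasks them until the running entropy sum would
    exceed the threshold. At least one position is always unmasked so the
    sampler makes progress.
    """
    ranked = sorted(masked, key=lambda i: entropy(pred_dists[i]))
    chosen, cum = [], 0.0
    for i in ranked:
        h = entropy(pred_dists[i])
        if chosen and cum + h > threshold:
            break
        chosen.append(i)
        cum += h
    return chosen

# Example: three masked positions with binary predictive distributions.
dists = [[0.9, 0.1], [0.5, 0.5], [0.99, 0.01]]
picked = select_positions(dists, masked=[0, 1, 2], threshold=0.4)
# Position 2 (most confident) and position 0 fit within the budget;
# the uncertain position 1 is deferred to a later iteration.
```

Under this reading, a small threshold recovers near-sequential decoding (one confident token per iteration), while a large threshold permits aggressive parallel unmasking, which matches the efficiency/accuracy trade-off the abstract's iteration bound quantifies.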