[2602.07132] Discrete Adjoint Matching
Summary
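The paper introduces Discrete Adjoint Matching (DAM), a discrete variant of Adjoint Matching (AM) for fine-tuning discrete generative models. It addresses the difficulty of carrying AM, which works well in continuous state spaces with differentiable rewards, over to discrete state spaces, which are nowhere differentiable.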
Why It Matters
Entropy-regularized reward optimization is widely used for fine-tuning generative models, and AM has proven highly effective for it in continuous state spaces. Extending that approach to discrete state spaces matters for discrete generative models such as diffusion-based large language models, with applications including mathematical reasoning.
Key Takeaways
- Discrete Adjoint Matching (DAM) offers a new approach for optimizing discrete generative models.
- The method targets discrete generative models characterized by continuous-time Markov chains (CTMCs), such as diffusion-based large language models.
- DAM is derived from a purely statistical standpoint, in contrast to the control-theoretic viewpoint behind AM.
- The effectiveness of DAM is demonstrated through synthetic and mathematical reasoning tasks.
- This research opens new avenues for algorithmic development in adjoint-based estimators.
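To make the model class named in the takeaways concrete, here is a minimal sketch of sampling a path from a continuous-time Markov chain with the standard Gillespie algorithm. It is not DAM and is not taken from the paper; the function name `simulate_ctmc`, the rate matrix `RATES`, and the three-state chain are assumptions chosen purely for illustration.

```python
import random


def simulate_ctmc(rate_matrix, start_state, t_end, rng):
    """Sample one CTMC path on [0, t_end) with the Gillespie algorithm.

    rate_matrix[i][j] (for i != j) is the jump rate from state i to state j;
    diagonal entries are ignored. Returns a list of (jump_time, state) pairs.
    """
    t, state = 0.0, start_state
    path = [(t, state)]
    while True:
        # Outgoing rates from the current state (no self-jumps).
        rates = [r if j != state else 0.0
                 for j, r in enumerate(rate_matrix[state])]
        total = sum(rates)
        if total == 0.0:
            break  # absorbing state: no further jumps
        t += rng.expovariate(total)  # exponential holding time
        if t >= t_end:
            break
        # Choose the next state with probability proportional to its rate.
        state = rng.choices(range(len(rates)), weights=rates)[0]
        path.append((t, state))
    return path


# Hypothetical 3-state chain; state 2 is absorbing.
RATES = [
    [0.0, 1.0, 0.5],
    [0.2, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]

path = simulate_ctmc(RATES, start_state=0, t_end=5.0, rng=random.Random(0))
for jump_time, state in path:
    print(f"t={jump_time:.3f}  state={state}")
```

In a CTMC-based generative model the jump rates would be produced by a learned network and would typically depend on time; the fixed matrix here only illustrates the sampling mechanics.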
Statistics > Machine Learning
arXiv:2602.07132 (stat)
[Submitted on 6 Feb 2026 (v1), last revised 14 Feb 2026 (this version, v2)]
Title: Discrete Adjoint Matching
Authors: Oswin So, Brian Karrer, Chuchu Fan, Ricky T. Q. Chen, Guan-Horng Liu
Abstract: Computation methods for solving entropy-regularized reward optimization -- a class of problems widely used for fine-tuning generative models -- have advanced rapidly. Among those, Adjoint Matching (AM, Domingo-Enrich et al., 2025) has proven highly effective in continuous state spaces with differentiable rewards. Transferring these practical successes to discrete generative modeling, however, remains particularly challenging and largely unexplored, mainly due to the drastic shift in generative model classes to discrete state spaces, which are nowhere differentiable. In this work, we propose Discrete Adjoint Matching (DAM) -- a discrete variant of AM for fine-tuning discrete generative models characterized by Continuous-Time Markov Chains, such as diffusion-based large language models. The core of DAM is the introduction of discrete adjoint -- an estimator of the optimal solution to the original problem but formulated on discrete domains -- from which standard matching frameworks can be applied. This is derived via a purely statistical standpoint, in contrast to the control-theoretic viewpoint in AM, thereby opening up new algorithmic oppo...