[2601.02799] Stratified Hazard Sampling: Minimal-Variance Event Scheduling for CTMC/DTMC Discrete Diffusion and Flow Models

arXiv - Machine Learning · 4 min read

Summary

This article presents Stratified Hazard Sampling (SHS), a training-free method for scheduling edit events in discrete diffusion and flow models that improves sample quality and robustness.

Why It Matters

The proposed SHS method addresses a key failure mode of current samplers for discrete diffusion models: independent per-step change decisions lead to under-editing (residual noise) and over-editing (cascading substitutions). By minimizing the variance of event scheduling, SHS produces more accurate and reliable outputs, making it a valuable contribution to the field.

Key Takeaways

  • SHS offers a training-free and hyperparameter-free approach to improve sampling in discrete diffusion models.
  • The method reduces variance in event scheduling, enhancing the quality of generated samples.
  • SHS maintains multimodality while ensuring robustness against token-level constraints.
  • The technique is applicable to various models that utilize stay-vs.-replace decompositions.
  • Experimental results demonstrate consistent improvements in sample quality with SHS.
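The instability SHS targets can be seen in a small simulation. When each discretization step makes an independent Bernoulli change decision, the per-position jump count follows a Poisson-binomial distribution with mean Σ pᵢ and variance Σ pᵢ(1−pᵢ). A minimal sketch (the per-step change probabilities below are made up for illustration):

```python
import random

def jump_count(hazards, rng):
    """Edits at one position when each step flips an independent
    Bernoulli coin with its own change probability."""
    return sum(rng.random() < h for h in hazards)

rng = random.Random(0)
hazards = [0.3, 0.2, 0.4, 0.1, 0.25]  # hypothetical per-step change probabilities

samples = [jump_count(hazards, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)

# Poisson-binomial moments for these probabilities:
# mean ≈ sum(p_i) = 1.25, variance ≈ sum(p_i * (1 - p_i)) = 0.8875
```

With more required edits (more steps with non-negligible change probability), this variance grows, which is exactly the regime where under- and over-editing appear.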

Computer Science > Machine Learning
arXiv:2601.02799 (cs)
[Submitted on 6 Jan 2026 (v1), last revised 17 Feb 2026 (this version, v2)]

Title: Stratified Hazard Sampling: Minimal-Variance Event Scheduling for CTMC/DTMC Discrete Diffusion and Flow Models
Authors: Seunghwan Jang, SooJean Han

Abstract: Uniform-noise discrete diffusion and flow models (e.g., D3PM, SEDD, UDLM, DFM) generate sequences non-autoregressively by iteratively refining randomly initialized vocabulary tokens through multiple context-dependent replacements. These models are typically formulated as time-inhomogeneous CTMC/DTMC processes and sampled using independent Bernoulli change decisions at each discretization step. This induces Poisson-binomial variance in per-position jump counts that grows with the number of required edits, leading to the characteristic under-editing (residual noise) and over-editing (cascading substitutions) failure modes that degrade sample quality, especially under tight discretization budgets. In contrast, absorbing-state (mask-start) models avoid this instability by allowing each position to jump at most once. We propose Stratified Hazard Sampling (SHS), a training-free, drop-in, and hyperparameter-free inference principle for any sampler that admits a stay-vs.-replace decom...
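As a generic illustration of why stratifying the event schedule shrinks this variance (a sketch of the idea under simplifying assumptions, not the paper's exact SHS construction): if the expected number of edits over the trajectory is Λ = Σ pᵢ, a systematic/stratified draw realizes either ⌊Λ⌋ or ⌈Λ⌉ events, so the count variance is at most 1/4, versus Σ pᵢ(1−pᵢ) for independent Bernoulli decisions.

```python
import math
import random

def stratified_count(hazards, rng):
    """Systematic draw of the total edit count: with Lambda = sum(p_i),
    return floor(Lambda + U), U ~ Uniform(0, 1). The count always equals
    floor(Lambda) or ceil(Lambda), so its variance is at most 1/4."""
    lam = sum(hazards)
    return math.floor(lam + rng.random())

rng = random.Random(0)
hazards = [0.3, 0.2, 0.4, 0.1, 0.25]  # same hypothetical per-step probabilities

samples = [stratified_count(hazards, rng) for _ in range(100_000)]
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
# mean ≈ 1.25 as before, but variance ≈ 0.25 * 0.75 = 0.1875 instead of 0.8875
```

In the true CTMC setting the cumulative hazard is an integral ∫λ(t)dt rather than a sum of step probabilities, so the sum here is a small-step approximation; a full sampler must also decide where along the trajectory each event lands, which this sketch leaves out.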

Related Articles

Google quietly releases an offline-first AI dictation app on iOS | TechCrunch
Machine Learning

Google's new offline-first dictation app uses Gemma AI models to take on apps like Wispr Flow.

TechCrunch - AI · 4 min ·
Machine Learning

How well do you understand how AI/deep learning works?

Specifically, how AI are programmed, trained, and how they perform their functions. I’ll be asking this in different subs to see if/how t...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

a fun survey to look at how consumers perceive the use of AI in fashion brand marketing. (all ages, all genders)

Hi r/artificial ! I'm posting on behalf of a friend who is conducting academic research for their dissertation. The survey looks at how c...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

I Built a Functional Cognitive Engine

Aura: https://github.com/youngbryan97/aura Aura is not a chatbot with personality prompts. It is a complete cognitive architecture — 60+ ...

Reddit - Artificial Intelligence · 1 min ·