[2601.20198] DeRaDiff: Denoising Time Realignment of Diffusion Models

arXiv - Machine Learning

Summary

The paper presents DeRaDiff, a denoising-time realignment method for diffusion models that adjusts the effective KL-regularization strength at sampling time, without any additional training.

Why It Matters

DeRaDiff addresses a significant challenge in machine learning related to the alignment of diffusion models with human preferences. By allowing for dynamic modulation of regularization strength during sampling, it reduces computational costs and enhances model performance, making it a valuable contribution to the field of generative AI.

Key Takeaways

  • DeRaDiff allows for efficient adjustment of regularization strength in diffusion models.
  • The method eliminates the need for expensive alignment sweeps, reducing computational costs.
  • It extends decoding-time realignment techniques from language models to diffusion models.
  • DeRaDiff closely approximates models aligned from scratch at different regularization strengths.
  • The approach enhances aesthetic appeal and mitigates artifacts in generated outputs.
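Decoding-time realignment in language models (DeRa) blends the aligned and reference models at inference time rather than retraining at each regularization strength. A natural diffusion-model analogue is to blend the two models' noise predictions at every denoising step. The sketch below illustrates that idea only; the function name and the linear-interpolation rule are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def realigned_eps(eps_aligned: np.ndarray, eps_ref: np.ndarray, lam: float) -> np.ndarray:
    """Blend the aligned model's noise prediction with the pretrained
    reference model's prediction at one denoising step.

    lam = 1.0 reproduces the aligned model; lam = 0.0 reproduces the
    pretrained reference. Intermediate values emulate models aligned at
    intermediate regularization strengths (hypothetical rule, for
    illustration only).
    """
    return lam * eps_aligned + (1.0 - lam) * eps_ref

# At each sampler step, both models predict the noise for the current
# latent, and the blended prediction is fed back into the sampler:
eps_a = np.array([0.2, -0.5])   # stand-in aligned-model prediction
eps_r = np.array([0.0,  0.1])   # stand-in reference-model prediction
eps = realigned_eps(eps_a, eps_r, lam=0.5)
```

The appeal of such a scheme is that one alignment run plus a cheap sweep over `lam` at sampling time replaces many full alignment runs at different regularization strengths.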

Computer Science > Machine Learning
arXiv:2601.20198 (cs)
[Submitted on 28 Jan 2026 (v1), last revised 20 Feb 2026 (this version, v2)]

Title: DeRaDiff: Denoising Time Realignment of Diffusion Models
Authors: Ratnavibusena Don Shahain Manujith, Teoh Tze Tzun, Kenji Kawaguchi, Yang Zhang

Abstract: Recent advances align diffusion models with human preferences to increase aesthetic appeal and mitigate artifacts and biases. Such methods aim to maximize a conditional output distribution aligned with higher rewards while not drifting far from a pretrained prior, which is commonly enforced by KL (Kullback-Leibler) regularization. A central issue remains: how does one choose the right regularization strength? Too high a strength leads to limited alignment; too low a strength leads to "reward hacking". This makes choosing the correct regularization strength highly non-trivial. Existing approaches sweep over this hyperparameter by aligning a pretrained model at multiple regularization strengths and then choosing the best one, which is prohibitively expensive. We introduce DeRaDiff, a denoising-time realignment procedure that, after aligning a pretrained model once, modulates the regularization strength during sampling to emulate models trained at other regular...
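The abstract's trade-off, maximizing reward while staying close to the pretrained prior under a KL penalty, is conventionally written as follows (notation assumed for illustration, not taken from the paper):

```latex
\max_{\theta}\;
\mathbb{E}_{c,\; x \sim p_\theta(\cdot \mid c)}\big[\, r(x, c) \,\big]
\;-\;
\beta \, D_{\mathrm{KL}}\!\big( p_\theta(\cdot \mid c) \,\Vert\, p_{\mathrm{ref}}(\cdot \mid c) \big)
```

Here $r$ is the reward model, $p_{\mathrm{ref}}$ the pretrained prior, and $\beta$ the regularization strength the expensive sweep targets: larger $\beta$ pins the model to the prior (limited alignment), smaller $\beta$ risks reward hacking.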
