[2604.09035] Advantage-Guided Diffusion for Model-Based Reinforcement Learning
arXiv:2604.09035 (cs)
Computer Science > Artificial Intelligence
[Submitted on 10 Apr 2026]

Title: Advantage-Guided Diffusion for Model-Based Reinforcement Learning
Authors: Daniele Foffano, Arvid Eriksson, David Broman, Karl H. Johansson, Alexandre Proutiere

Abstract: Model-based reinforcement learning (MBRL) with autoregressive world models suffers from compounding errors, whereas diffusion world models mitigate this by generating trajectory segments jointly. However, existing diffusion guides are either policy-only, discarding value information, or reward-based, which becomes myopic when the diffusion horizon is short. We introduce Advantage-Guided Diffusion for MBRL (AGD-MBRL), which steers the reverse diffusion process using the agent's advantage estimates, so that sampling concentrates on trajectories expected to yield higher long-term return beyond the generated window. We develop two guides: (i) Sigmoid Advantage Guidance (SAG) and (ii) Exponential Advantage Guidance (EAG). We prove that a diffusion model guided through SAG or EAG performs reweighted sampling of trajectories with weights increasing in state-action advantage, implying policy improvement under standard assumptions. Additionally, we show that the trajectories generated from AGD-MBRL follow an improved policy (that is, with higher v...
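The core idea stated in the abstract, sampling trajectories with weights that increase in state-action advantage, can be illustrated with a minimal sketch. The functions below are hypothetical stand-ins for the paper's SAG and EAG guides (the abstract does not specify their exact forms); they only show a sigmoid-shaped and an exponential weighting, each monotonically increasing in advantage, used to reweight a set of candidate trajectories.

```python
import numpy as np

def advantage_weights(advantages, mode="sag", temperature=1.0):
    """Illustrative guidance weights, monotonically increasing in advantage.

    'sag' uses a sigmoid, 'eag' an exponential weighting; both are
    assumptions for illustration, not the paper's exact SAG/EAG guides.
    """
    a = np.asarray(advantages, dtype=float) / temperature
    if mode == "sag":
        w = 1.0 / (1.0 + np.exp(-a))          # sigmoid of advantage
    elif mode == "eag":
        w = np.exp(a - a.max())               # exponential, shifted for stability
    else:
        raise ValueError(f"unknown mode: {mode}")
    return w / w.sum()                        # normalize to a distribution

def reweighted_sample(rng, candidates, advantages, mode="sag"):
    """Pick one candidate trajectory with probability increasing in its advantage."""
    p = advantage_weights(advantages, mode=mode)
    idx = rng.choice(len(candidates), p=p)
    return candidates[idx]

# Toy usage: three candidate trajectory segments with estimated advantages.
rng = np.random.default_rng(0)
candidates = ["traj_a", "traj_b", "traj_c"]
advantages = [-1.0, 0.0, 2.0]
print(advantage_weights(advantages, mode="eag"))
print(reweighted_sample(rng, candidates, advantages, mode="sag"))
```

In a full MBRL loop, such weights would instead guide the reverse diffusion steps themselves rather than a post-hoc resampling, but the monotone reweighting property that the paper's policy-improvement argument rests on is the same.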