[2512.13586] ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
Computer Science > Computation and Language
arXiv:2512.13586 (cs)
[Submitted on 15 Dec 2025 (v1), last revised 5 Mar 2026 (this version, v2)]

Title: ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding
Authors: Jia-Nan Li, Jian Guan, Wei Wu, Chongxuan Li

Abstract: Autoregressive models (ARMs) are hindered by slow sequential inference. While masked diffusion models (MDMs) offer a parallel alternative, they suffer from critical drawbacks: high computational overhead from precluding Key-Value (KV) caching, and incoherent generation arising from learning dependencies over an intractable space of token combinations. To address these limitations, we introduce ReFusion, a novel masked diffusion model that integrates sequence reorganization into the causal attention framework. By elevating parallel decoding from the token level to a higher slot level, ReFusion interleaves inter-slot diffusion-based selection with intra-slot autoregressive infilling, while reordering newly generated slots ahead of the remaining masks after each iteration. Consequently, this design simultaneously unlocks full KV cache reuse and reduces learning complexity from an intractable token combination space to a manageable slot-level permutation space. Extensive experiments on seven diverse benchmarks show that \...
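The decoding loop the abstract describes can be illustrated with a toy sketch. This is NOT the authors' implementation: the model, the diffusion-based slot selection, and all names here (`decode`, `fill_token`, `MASK`, `slot_size`) are hypothetical stand-ins. It only shows the control flow implied by the abstract: slots of masks are chosen one at a time, filled autoregressively within the slot, and the finished slot is moved ahead of the remaining masks so the committed prefix stays causal and its KV cache could be reused.

```python
# Toy sketch of slot-level parallel-autoregressive decoding (illustrative only;
# not the ReFusion implementation). A neural model would replace `fill_token`
# and the slot-selection rule.
MASK = "<m>"

def decode(prompt, num_slots, slot_size, fill_token):
    """fill_token(context) -> next token; a stand-in for the language model."""
    prefix = list(prompt)                 # committed tokens (KV-cache-friendly prefix)
    pending = [[MASK] * slot_size for _ in range(num_slots)]
    order = []                            # order in which slots were generated
    while pending:
        slot_idx = 0                      # stand-in for inter-slot diffusion-based selection
        pending.pop(slot_idx)
        filled = []
        for _ in range(slot_size):        # intra-slot autoregressive infilling
            filled.append(fill_token(prefix + filled))
        prefix += filled                  # reorder: finished slot moves ahead of remaining masks
        order.append(filled)
    return prefix, order

# Dummy "model": names each token after its context length.
tokens, order = decode(["p0"], num_slots=2, slot_size=2,
                       fill_token=lambda ctx: f"t{len(ctx)}")
```

Because every finished slot is appended to the prefix before the next iteration, each `fill_token` call conditions only on already-committed tokens, which is what makes standard causal KV caching applicable.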