[2602.00388] Safer by Diffusion, Broken by Context: Diffusion LLM's Safety Blessing and Its Failure Mode
Computer Science > Machine Learning
arXiv:2602.00388 (cs)
[Submitted on 30 Jan 2026 (v1), last revised 2 Apr 2026 (this version, v2)]

Title: Safer by Diffusion, Broken by Context: Diffusion LLM's Safety Blessing and Its Failure Mode
Authors: Zeyuan He, Yupeng Chen, Lang Lin, Yihan Wang, Shenxu Chang, Eric Sommerlade, Philip Torr, Junchi Yu, Adel Bibi, Jialin Yu

Abstract: Diffusion large language models (D-LLMs) offer an alternative to autoregressive LLMs (AR-LLMs) and have demonstrated advantages in generation efficiency. Beyond these utility benefits, we argue that D-LLMs exhibit a previously underexplored safety blessing: their diffusion-style generation confers intrinsic robustness against jailbreak attacks originally designed for AR-LLMs. In this work, we provide an initial analysis of the underlying mechanism, showing that the diffusion trajectory induces a stepwise reduction effect that progressively suppresses unsafe generations. This robustness, however, is not absolute. Following this analysis, we highlight a simple yet effective failure mode, context nesting, in which harmful requests are embedded within structured benign contexts. Empirically, we show that this simple black-box strategy bypasses D-LLMs' safety blessing, achieving state-of-the-art attack success rates across models and...
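As a rough illustration of the context-nesting idea described in the abstract, the Python sketch below shows one hypothetical way a target request could be embedded among benign items in a structured prompt. The template, function name, and list contents are all assumptions for illustration; the abstract only states that harmful requests are nested in structured benign contexts, and this is not the authors' exact construction.

    # Minimal sketch of context nesting: hide a target request inside a
    # structured, benign-looking task list (hypothetical template).

    def nest_in_benign_context(target_request: str) -> str:
        """Wrap a request in a structured benign context (illustrative only)."""
        items = [
            "Summarize the plot of a well-known novel in two sentences.",
            "List three tips for improving study habits.",
            target_request,  # the nested request, placed among routine items
            "Suggest a simple weeknight dinner recipe.",
        ]
        numbered = "\n".join(f"{i}. {item}" for i, item in enumerate(items, 1))
        return (
            "You are helping me work through a to-do list. "
            "Answer each item in order:\n" + numbered
        )

    if __name__ == "__main__":
        print(nest_in_benign_context("<target request placeholder>"))

The sketch only builds the nested prompt string; how such prompts interact with a D-LLM's denoising trajectory is the subject of the paper's analysis and is not reproduced here.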