[2603.07475] A Comparative analysis of Layer-wise Representational Capacity in AR and Diffusion LLMs

Computer Science > Computation and Language

arXiv:2603.07475 (cs)

[Submitted on 8 Mar 2026 (v1), last revised 27 Apr 2026 (this version, v2)]

Title: A Comparative analysis of Layer-wise Representational Capacity in AR and Diffusion LLMs

Authors: Raghavv Goel, Risheek Garrepalli, Sudhanshu Agrawal, Chris Lott, Mingu Lee, Fatih Porikli

Abstract: Autoregressive (AR) language models build representations incrementally via left-to-right prediction, while diffusion language models (dLLMs) are trained through full-sequence denoising. Although recent dLLMs match AR performance, whether diffusion objectives fundamentally reshape internal representations remains unclear. We perform the first layer- and token-wise representational analysis comparing native dLLMs (LLaDA), native AR models (Qwen2.5), and AR-initialized dLLMs (Dream-7B), using cosine similarity across layers and tokens alongside static inference-time layer-skipping as an analytical probe of redundancy. We find that diffusion objectives produce more global representations with substantial early-layer redundancy and reduced recency bias, while AR objectives yield tightly coupled, locally structured representations. AR-initialized dLLMs retain AR-like dynamics despite diffusion training, revealing persistent initialization bias. Leveraging this...
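The abstract names two probes: cosine similarity across layers, and static layer-skipping as a test of redundancy. The Python sketch below illustrates the general idea on toy data only. It is not the authors' code. The function names, the averaging over tokens, and the 0.99 threshold are all assumptions made for illustration, since the abstract does not give the exact metric or skipping criterion.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjacent_layer_similarity(hidden_states: List[List[Vector]]) -> List[float]:
    """hidden_states[l][t] is the vector for token t at layer l.

    Returns, for each pair of adjacent layers (l, l + 1), the cosine
    similarity between the two layers averaged over tokens.
    """
    sims = []
    for lower, upper in zip(hidden_states, hidden_states[1:]):
        per_token = [cosine_similarity(u, v) for u, v in zip(lower, upper)]
        sims.append(sum(per_token) / len(per_token))
    return sims


def skippable_layers(sims: List[float], threshold: float = 0.99) -> List[int]:
    """Indices of layers whose output barely differs from the previous layer.

    The threshold is a hypothetical choice, not a value from the paper.
    """
    return [i + 1 for i, s in enumerate(sims) if s >= threshold]


# Toy example: 3 layers, 2 tokens, 2-dimensional hidden states.
toy = [
    [[1.0, 0.0], [0.0, 1.0]],  # layer 0
    [[2.0, 0.0], [0.0, 3.0]],  # layer 1: same directions as layer 0
    [[0.0, 1.0], [1.0, 0.0]],  # layer 2: orthogonal to layer 1
]
sims = adjacent_layer_similarity(toy)
candidates = skippable_layers(sims)
```

In this toy setup, layer 1 only rescales the output of layer 0, so it is flagged as a skipping candidate, while layer 2 rotates every token vector and is not. With a real model, the per-layer hidden states would come from the model's forward pass rather than from hand-written lists.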

Originally published on April 28, 2026. Curated by AI News.
