[2506.17871] LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
Computer Science > Computation and Language
arXiv:2506.17871 (cs)
[Submitted on 22 Jun 2025 (v1), last revised 2 Mar 2026 (this version, v3)]

Title: LLM Probability Concentration: How Alignment Shrinks the Generative Horizon
Authors: Chenghao Yang, Sida Li, Ari Holtzman

Abstract: Despite their impressive capabilities, aligned large language models (LLMs) often generate outputs that lack diversity. What drives this consistency in the generation? We investigate this phenomenon through the lens of probability concentration in the model's output distribution. To quantify this concentration, we introduce the *Branching Factor* (BF) -- a token-invariant measure of the effective number of plausible next steps during generation. Our empirical analysis reveals two key findings: (1) BF often decreases as generation progresses, suggesting that LLMs become more predictable as they generate. (2) alignment tuning substantially sharpens the model's output distribution from the outset, reducing BF by a factor of 2-5 overall, and up to an order of magnitude (e.g., from 12 to 1.2) at the beginning positions. This stark reduction helps explain why aligned models often appear less sensitive to decoding strategies. Building on this insight, we find this consistency has surprising implications for complex reasoning. Aligned Chain-of-Thought (CoT)...
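
The abstract describes the Branching Factor only informally, as an "effective number of plausible next steps". A minimal sketch of one standard way to turn a next-token distribution into such an effective count is the exponentiated Shannon entropy. This is an assumption made for illustration: the abstract does not state the paper's exact formula, and the function name `effective_branching` and the example distributions are hypothetical.

```python
import math


def effective_branching(probs):
    """Exponentiated Shannon entropy of a next-token distribution.

    Assumption: this is a generic "effective number of options" measure,
    not necessarily the paper's exact Branching Factor definition.
    """
    # Entropy in nats; zero-probability tokens contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return math.exp(entropy)


# Hypothetical distributions, chosen only to illustrate the two regimes.
flat = [1 / 12] * 12               # 12 equally likely next tokens
peaked = [0.97, 0.01, 0.01, 0.01]  # one token dominates

print(effective_branching(flat))    # about 12 for a uniform distribution over 12 tokens
print(effective_branching(peaked))  # only slightly above 1
```

Under this measure a uniform distribution over k tokens has an effective count of exactly k, and a sharply peaked one approaches 1, which matches the qualitative contrast the abstract draws between base and aligned models.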