[2604.05469] Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models
Statistics > Methodology
arXiv:2604.05469 (stat)
[Submitted on 7 Apr 2026]

Title: Task Ecologies and the Evolution of World-Tracking Representations in Large Language Models
Authors: Giulio Valentino Dalla Riva

Abstract: We study language models as evolving model organisms and ask when autoregressive next-token learning selects for world-tracking representations. For any encoding of latent world states, the Bayes-optimal next-token cross-entropy decomposes into the irreducible conditional entropy plus a Jensen--Shannon excess term. That excess vanishes if and only if the encoding preserves the training ecology's equivalence classes. This yields a precise notion of ecological veridicality for language models and identifies the minimum-complexity zero-excess solution as the quotient partition by training equivalence. We then determine when this fixed-encoding analysis applies to transformer families: frozen dense and frozen Mixture-of-Experts transformers satisfy it, in-context learning does not enlarge the model's separation set, and per-task adaptation breaks the premise. The framework predicts two characteristic failure modes: simplicity pressure preferentially removes low-gain distinctions, and training-optimal models can still incur positive excess on deployment ecologies that refine the trai...
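The decomposition in the abstract can be illustrated numerically. The sketch below is an assumption-laden toy, not the paper's code: it posits a small set of latent states with known next-token distributions, takes the Bayes-optimal predictor given only a code e(s) to be the within-cell mixture, and checks that the expected cross-entropy splits into the conditional entropy H(X|S) plus an excess term, which is zero exactly when the encoding separates states with distinct next-token laws and positive when it merges them.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats, treating 0*log(0) as 0."""
    p = np.asarray(p, dtype=float)
    return -np.sum(np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0))

def js_excess(prior, cond, encoding):
    """Expected cross-entropy of the cell-mixture predictor minus H(X|S).

    prior:    shape (S,), distribution over latent states s.
    cond:     shape (S, V), row s is the next-token law p(x|s).
    encoding: shape (S,) of ints, the code e(s) assigned to each state.
    Assumes each cell mixture is positive wherever some p(x|s) in the cell is.
    """
    prior = np.asarray(prior, dtype=float)
    cond = np.asarray(cond, dtype=float)
    h_cond = sum(prior[s] * entropy(cond[s]) for s in range(len(prior)))
    total_ce = 0.0
    for c in np.unique(encoding):
        idx = np.where(encoding == c)[0]
        cell_mass = prior[idx].sum()
        w = prior[idx] / cell_mass                    # within-cell weights
        mix = (w[:, None] * cond[idx]).sum(axis=0)    # Bayes-optimal predictor for this cell
        for s, ws in zip(idx, w):
            # cross-entropy of the true p(.|s) against the shared cell mixture
            total_ce += cell_mass * ws * -np.sum(
                np.where(cond[s] > 0, cond[s] * np.log(mix), 0.0))
    # the gap is a prior-weighted generalized Jensen-Shannon divergence
    return total_ce - h_cond

# Four latent states, two next-token symbols; states 0,1 and 2,3 are
# training-equivalent (identical next-token laws).
prior = np.array([0.25, 0.25, 0.25, 0.25])
cond = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])

fine = np.array([0, 0, 1, 1])    # preserves the equivalence classes
coarse = np.array([0, 0, 0, 0])  # merges the two distinct classes

print(js_excess(prior, cond, fine))    # ~0: encoding is ecologically veridical
print(js_excess(prior, cond, coarse))  # >0: JS excess from merged classes
```

Under these toy numbers the coarse encoding's excess equals the equal-weight Jensen-Shannon divergence between [0.9, 0.1] and [0.1, 0.9], i.e. log 2 minus the entropy of [0.9, 0.1], consistent with the mixture-gap identity the abstract invokes.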