[2505.09855] An evolutionary perspective on modes of learning in Transformers
Computer Science > Machine Learning
arXiv:2505.09855 (cs)
[Submitted on 14 May 2025 (v1), last revised 21 Mar 2026 (this version, v2)]

Title: An evolutionary perspective on modes of learning in Transformers
Authors: Alexander Y. Ku, Thomas L. Griffiths, Stephanie C.Y. Chan

Abstract: The success of Transformers lies in their ability to improve inference through two complementary strategies: the permanent refinement of model parameters via in-weight learning (IWL), and the ephemeral modulation of inferences via in-context learning (ICL), which leverages contextual information maintained in the model's activations. Evolutionary biology tells us that the predictability of the environment across timescales predicts the extent to which analogous strategies should be preferred. Genetic evolution adapts to stable environmental features by gradually modifying the genotype over generations. Conversely, environmental volatility favors plasticity, which enables a single genotype to express different traits within a lifetime, provided there are reliable cues to guide the adaptation. We operationalize these dimensions (environmental stability and cue reliability) in controlled task settings (sinusoid regression and Omniglot classification) to characterize their influence on learning in Transformers. We find that stable environm...