[2602.22600] Transformers converge to invariant algorithmic cores

arXiv - AI · 3 min read

Summary

The paper shows that independently trained transformers, despite learning different weights, converge to the same invariant algorithmic cores essential for task performance, offering insight into their internal workings.

Why It Matters

Understanding the internal structures of large language models like transformers is crucial for advancing AI interpretability and improving model design. This research highlights the consistent algorithmic cores that persist across different training runs, offering a pathway to better mechanistic interpretability.

Key Takeaways

  • Transformers exhibit invariant algorithmic cores despite different weight configurations (see the sketch after this list).
  • Identifying these cores can enhance mechanistic interpretability of AI models.
  • The study reveals low-dimensional invariants that persist across training runs and scales.
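
As a concrete illustration of that first takeaway, here is a minimal numpy/scipy sketch, not the paper's extraction procedure: two toy "models" implement the same 3×3 operator through unrelated random embeddings into a 64-dimensional hidden space (the dimensions, the random embeddings, and the operator are all illustrative assumptions). The two cores occupy nearly orthogonal subspaces, yet the operator recovered inside each core has an identical eigenvalue spectrum, the pattern the abstract below reports for Markov-chain transformers.

```python
import numpy as np
from scipy.linalg import subspace_angles

rng = np.random.default_rng(0)
d, k = 64, 3  # hidden width and hypothesized core dimension (assumptions)

# A fixed k x k "transition" operator that both toy models must implement.
A = rng.normal(size=(k, k))

# Each model embeds the operator through its own random orthonormal map:
# same computation, unrelated weights.
Q1, _ = np.linalg.qr(rng.normal(size=(d, k)))
Q2, _ = np.linalg.qr(rng.normal(size=(d, k)))
M1 = Q1 @ A @ Q1.T  # model 1's effective update in hidden space
M2 = Q2 @ A @ Q2.T  # model 2's effective update

def core_basis(M, k):
    """Orthonormal basis for the rank-k core of an effective linear map."""
    u, _, _ = np.linalg.svd(M)
    return u[:, :k]

B1, B2 = core_basis(M1, k), core_basis(M2, k)

# The two cores sit in nearly orthogonal parts of hidden space...
print("principal angles (deg):", np.degrees(subspace_angles(B1, B2)).round(1))

# ...yet the operator recovered inside each core has the same spectrum.
eig1 = np.sort_complex(np.linalg.eigvals(B1.T @ M1 @ B1))
eig2 = np.sort_complex(np.linalg.eigvals(B2.T @ M2 @ B2))
print("core spectrum, model 1:", eig1.round(3))
print("core spectrum, model 2:", eig2.round(3))
```

With independent embeddings the principal angles come out near 90 degrees, while the two spectra agree to numerical precision; that is the sense in which "different weights" can still mean "the same algorithm".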

Computer Science > Machine Learning
arXiv:2602.22600 (cs) · Submitted on 26 Feb 2026
Title: Transformers converge to invariant algorithmic cores
Authors: Joshua S. Schiffman

Abstract: Large language models exhibit sophisticated capabilities, yet understanding how they work internally remains a central challenge. A fundamental obstacle is that training selects for behavior, not circuitry, so many weight configurations can implement the same function. Which internal structures reflect the computation, and which are accidents of a particular training run? This work extracts algorithmic cores: compact subspaces necessary and sufficient for task performance. Independently trained transformers learn different weights but converge to the same cores. Markov-chain transformers embed 3D cores in nearly orthogonal subspaces yet recover identical transition spectra. Modular-addition transformers discover compact cyclic operators at grokking that later inflate, yielding a predictive model of the memorization-to-generalization transition. GPT-2 language models govern subject-verb agreement through a single axis that, when flipped, inverts grammatical number throughout generation across scales. These results reveal low-dimensional invariants that persist across training runs and scales, suggesting that transformer computations are organized around compact,...
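
The abstract's central notion, a subspace that is necessary and sufficient for task performance, has a simple operational reading: restrict activations to the candidate core and performance should survive (sufficiency); project the core out and performance should collapse (necessity). The numpy sketch below runs that test on a synthetic task; the toy activations, the least-squares readout, and all dimensions are assumptions for illustration, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, k = 2000, 64, 3  # samples, hidden width, core dimension (assumptions)

# The label is decodable only from a k-dimensional subspace spanned by Q.
Q, _ = np.linalg.qr(rng.normal(size=(d, k)))
z = rng.normal(size=(n, k))
y = z[:, 0] > 0                              # boolean labels
H = z @ Q.T + 0.5 * rng.normal(size=(n, d))  # core signal + distractor noise

# A fixed linear readout fit on the raw activations (least squares for brevity).
w, *_ = np.linalg.lstsq(H, np.where(y, 1.0, -1.0), rcond=None)

def accuracy(Hx):
    """Fraction of examples the fixed readout classifies correctly."""
    return ((Hx @ w > 0) == y).mean()

P = Q @ Q.T  # projector onto the candidate core subspace
print("full activations        :", accuracy(H))
print("core only (sufficiency) :", accuracy(H @ P))               # stays high
print("core ablated (necessity):", accuracy(H @ (np.eye(d) - P)))  # ~ chance
```

On this toy data the full and core-restricted accuracies match, while the ablated accuracy falls to roughly chance, the signature one would look for when validating a candidate core in a real model.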

Related Articles

LLMs

What I learned about multi-agent coordination running 9 specialized Claude agents

I've been experimenting with multi-agent AI systems and ended up building something more ambitious than I originally planned: a fully ope...

Reddit - Artificial Intelligence · 1 min ·
LLMs

[D] The problem with comparing AI memory system benchmarks — different evaluation methods make scores meaningless

I've been reviewing how various AI memory systems evaluate their performance and noticed a fundamental issue with cross-system comparison...

Reddit - Machine Learning · 1 min ·
LLMs

Shifting to AI model customization is an architectural imperative | MIT Technology Review

In the early days of large language models (LLMs), we grew accustomed to massive 10x jumps in reasoning and coding capability with every ...

MIT Technology Review · 6 min ·
LLMs

Artificial intelligence will always depend on humans; otherwise it will become obsolete.

I was looking for a tool for my specific need. There was not any, so I started to write the program in Python, just a basic structure. Then...

Reddit - Artificial Intelligence · 1 min ·