[2602.19109] Post-Routing Arithmetic in Llama-3: Last-Token Result Writing and Rotation-Structured Digit Directions
Summary
The paper examines three-digit addition in Meta-Llama-3-8B, characterizing how arithmetic results are finalized after cross-token routing becomes causally irrelevant, with the last input token playing the central role.
Why It Matters
This research pins down where and how Llama-3 finalizes arithmetic answers: past a sharp layer boundary, the result is written at the last token without further cross-token routing. Mechanistic findings of this kind support interpretability-driven analysis and targeted causal editing of model behavior on numerical tasks.
Key Takeaways
- Post-routing arithmetic in Llama-3 relies almost entirely on the last input token: beyond the boundary layer, it alone controls the decoded sum.
- Causal residual patching and cumulative attention ablations localize a sharp boundary near layer 17, beyond which late-layer self-attention is largely dispensable.
- Digit(-sum) direction dictionaries vary with the next-higher-digit context but remain related by an approximately orthogonal map inside a shared low-rank subspace (low-rank Procrustes alignment).
- Naive cross-context transfer of digit edits fails, while rotating directions through the learned map restores strict counterfactual edits; negative controls do not recover.
- This research can inform future AI model designs, particularly in arithmetic tasks.
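The causal residual patching mentioned above can be sketched in miniature. The following is a hedged illustration, not the paper's pipeline: a toy numpy residual stack stands in for Llama-3, and the layer count, dimensions, and "clean"/"corrupt" inputs are all illustrative assumptions. The logic is the paper's, though: cache the clean run's last-token residual after each layer, splice it into a corrupted run, and ask at which layer the patch alone restores the clean last-token output.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_layers, seq = 16, 6, 5  # toy sizes, not Llama-3's
Ws = [rng.normal(scale=0.2, size=(d_model, d_model)) for _ in range(n_layers)]

def forward(x, patch_layer=None, patch_value=None):
    """Run the residual stack; optionally overwrite the last-token
    residual right after `patch_layer` with `patch_value`."""
    x = x.copy()
    for i, W in enumerate(Ws):
        x = x + np.tanh(x @ W)       # toy residual block
        if i == patch_layer:
            x[-1] = patch_value      # patch the last-token residual
    return x

clean = rng.normal(size=(seq, d_model))    # stand-in for the clean prompt
corrupt = rng.normal(size=(seq, d_model))  # stand-in for a corrupted prompt

clean_out = forward(clean)
# Cache the clean last-token residual after each layer.
caches, x = [], clean.copy()
for W in Ws:
    x = x + np.tanh(x @ W)
    caches.append(x[-1].copy())

# Patch each layer's clean residual into the corrupted run and measure
# how much of the clean last-token output it restores.
for L in range(n_layers):
    patched = forward(corrupt, patch_layer=L, patch_value=caches[L])
    num = patched[-1] @ clean_out[-1]
    den = np.linalg.norm(patched[-1]) * np.linalg.norm(clean_out[-1])
    print(f"layer {L}: last-token recovery {num / den:+.3f}")
```

In the real experiment the recovery metric is the decoded sum's logit rather than cosine similarity, and a sharp jump in recovery around a layer (near layer 17 in the paper) marks the boundary past which the last-token residual alone determines the answer.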
Computer Science > Artificial Intelligence

arXiv:2602.19109 (cs) [Submitted on 22 Feb 2026]

Title: Post-Routing Arithmetic in Llama-3: Last-Token Result Writing and Rotation-Structured Digit Directions
Authors: Yao Yan

Abstract: We study three-digit addition in Meta-Llama-3-8B (base) under a one-token readout to characterize how arithmetic answers are finalized after cross-token routing becomes causally irrelevant. Causal residual patching and cumulative attention ablations localize a sharp boundary near layer 17: beyond it, the decoded sum is controlled almost entirely by the last input token and late-layer self-attention is largely dispensable. In this post-routing regime, digit(-sum) direction dictionaries vary with a next-higher-digit context but are well-related by an approximately orthogonal map inside a shared low-rank subspace (low-rank Procrustes alignment). Causal digit editing matches this geometry: naive cross-context transfer fails, while rotating directions through the learned map restores strict counterfactual edits; negative controls do not recover.

Subjects: Artificial Intelligence (cs.AI)
Cite as: arXiv:2602.19109 [cs.AI] (or arXiv:2602.19109v1 [cs.AI] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.19109
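The low-rank Procrustes alignment described in the abstract can be sketched on synthetic data. This is a minimal illustration, assuming the shared subspace is taken from the top right singular vectors of the stacked dictionaries (the paper's exact subspace-selection procedure may differ); the dimensions, rank, and synthetic dictionaries are invented for the demo.

```python
import numpy as np

def procrustes_rotation(D_a, D_b, rank):
    """Fit an approximately orthogonal map M inside a shared low-rank
    subspace so that D_a @ M ~ D_b (orthogonal Procrustes)."""
    # Shared subspace: top-`rank` right singular vectors of the stacked
    # dictionaries (an assumption about how the subspace is chosen).
    _, _, Vt = np.linalg.svd(np.vstack([D_a, D_b]), full_matrices=False)
    P = Vt[:rank].T                    # (d, rank) orthonormal basis
    A, B = D_a @ P, D_b @ P            # project both dictionaries
    U, _, Wt = np.linalg.svd(A.T @ B)  # orthogonal Procrustes solution
    R = U @ Wt                         # rotation inside the subspace
    return P @ R @ P.T                 # lift back to full space

# Synthetic check: two "context" dictionaries related by a rotation
# inside an r-dimensional subspace, matching the paper's geometry.
rng = np.random.default_rng(0)
d, r = 64, 8
P0 = np.linalg.qr(rng.normal(size=(d, r)))[0]  # shared subspace basis
R0 = np.linalg.qr(rng.normal(size=(r, r)))[0]  # within-subspace rotation
C = rng.normal(size=(10, r))                   # one row per digit 0-9
D_a = C @ P0.T                                 # context-a digit directions
D_b = C @ R0 @ P0.T                            # context-b digit directions

M = procrustes_rotation(D_a, D_b, rank=r)
# Naive transfer (using D_a directly) misses; rotated directions match,
# mirroring the paper's finding that only rotated edits succeed.
print("naive error:  ", np.linalg.norm(D_a - D_b))
print("rotated error:", np.linalg.norm(D_a @ M - D_b))
```

The design choice mirrors the paper's causal result: because the two dictionaries differ by an approximately orthogonal within-subspace map, transplanting raw directions across contexts fails, while directions passed through the fitted rotation land on the target dictionary.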