[2603.00270] Transformers Remember First, Forget Last: Dual-Process Interference in LLMs
Computer Science > Information Retrieval
arXiv:2603.00270 (cs)
[Submitted on 27 Feb 2026]
Title: Transformers Remember First, Forget Last: Dual-Process Interference in LLMs
Authors: Sourav Chattaraj, Kanak Raj

Abstract: When large language models encounter conflicting information in context, which memories survive -- early or recent? We adapt classical interference paradigms from cognitive psychology to answer this question, testing 39 LLMs across diverse architectures and scales. Every model shows the same pattern: proactive interference (PI) dominates retroactive interference (RI) universally (Cohen's d = 1.73, p < 0.0001), meaning early encodings are protected at the cost of recent information -- the opposite of human memory, where RI typically dominates. Three findings indicate that RI and PI reflect separate memory mechanisms. RI and PI are uncorrelated (R^2 = 0.044), rejecting a unified "memory capacity." Model size predicts RI resistance (R^2 = 0.49) but not PI (R^2 = 0.06, n.s.) -- only RI is capacity-dependent. And error analysis reveals distinct failure modes: RI failures are passive retrieval failures (51%), while PI failures show active primacy intrusion (56%); both show <1% hallucination. These patterns parallel the consolidation-retrieval distinction in cognitive science, suggesting that transf...