[2602.14814] Learning State-Tracking from Code Using Linear RNNs

arXiv - Machine Learning

Summary

This paper studies state-tracking in sequence models by converting permutation-composition tasks into code via REPL traces, showing that linear RNNs succeed in this next-token-prediction setting where Transformers fail.

Why It Matters

Understanding how different neural network architectures perform in state-tracking tasks is crucial for advancing machine learning models. This research provides insights into the limitations of Transformers and the potential of linear RNNs, which could influence future model design and applications in AI.

Key Takeaways

  • Linear RNNs excel at state-tracking tasks where Transformers fail.
  • Permutation composition tasks expose the limits of sequence-model architectures.
  • State-tracking in code is complicated by actions that are not fully observable.
  • The study frames this setting as tracking a probabilistic finite-state automaton with deterministic state reveals.
  • In that setup, linear RNNs can track state worse than non-linear RNNs.
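The permutation-composition testbed can be made concrete with a short sketch (illustrative only; the paper's exact task construction may differ): each input token is a permutation, and the target "state" is the running composition of all permutations seen so far.

```python
# Illustrative sketch of permutation composition as a state-tracking task.
# Each action is a permutation of {0..n-1}; the state after each step is
# the running composition (the classic S_n "word problem").

def compose(p, q):
    """Apply permutation p after q: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def running_states(actions, n):
    """Return the running composition (the 'state') after each action."""
    state = tuple(range(n))  # start from the identity permutation
    states = []
    for a in actions:
        state = compose(a, state)
        states.append(state)
    return states

# Two permutations of {0, 1, 2}: a swap and a 3-cycle
swap01 = (1, 0, 2)
cycle = (1, 2, 0)
print(running_states([swap01, cycle, swap01], n=3))
# -> [(1, 0, 2), (2, 1, 0), (2, 0, 1)]
```

A sequence model trained to predict `running_states` from the action sequence must maintain the full group element internally, which is what makes this family of tasks a sharp probe of architectural expressivity.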

Computer Science > Machine Learning

arXiv:2602.14814 (cs) [Submitted on 16 Feb 2026]

Title: Learning State-Tracking from Code Using Linear RNNs
Authors: Julien Siems, Riccardo Grazzi, Kirill Kalinin, Hitesh Ballani, Babak Rahmani

Abstract: Over the last years, state-tracking tasks, particularly permutation composition, have become a testbed for understanding the limits of sequence-model architectures such as Transformers and RNNs (linear and non-linear). However, these are often sequence-to-sequence tasks: learning to map actions (permutations) to states, which is incompatible with the next-token prediction setting commonly used to train language models. We address this gap by converting permutation composition into code via REPL traces that interleave state reveals (through prints) with variable transformations. We show that linear RNNs capable of state-tracking also excel in this setting, while Transformers still fail. Motivated by this representation, we investigate why tracking states in code is generally difficult: actions are not always fully observable. We frame this as tracking the state of a probabilistic finite-state automaton with deterministic state reveals and show that linear RNNs can be worse than non-linear RNNs at tracking states in this setup.

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL)
Cite as: arXiv:2602.14814 [cs...
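The conversion described in the abstract — permutation composition rendered as code, with variable transformations interleaved with state-revealing prints — can be sketched as follows. The trace format below is an assumption for illustration; the paper's exact syntax and reveal schedule may differ.

```python
import random

def to_repl_trace(actions, n, reveal_prob=0.5, seed=0):
    """Render a permutation sequence as a REPL-style trace:
    variable transformations interleaved with occasional
    state-revealing prints (illustrative format only)."""
    rng = random.Random(seed)
    state = list(range(n))  # the tracked variable starts as the identity
    lines = [f">>> x = {state}"]
    for a in actions:
        state = [state[i] for i in a]  # apply permutation a to x
        lines.append(f">>> x = [x[i] for i in {list(a)}]")
        if rng.random() < reveal_prob:  # stochastic state reveal
            lines.append(">>> print(x)")
            lines.append(str(state))
    return "\n".join(lines)

print(to_repl_trace([(1, 0, 2), (1, 2, 0)], n=3, reveal_prob=1.0))
```

Trained with next-token prediction on such traces, a model is scored on the revealed states after each `print(x)`, so it must track the hidden variable even across steps where no reveal occurs — which is where the partial-observability difficulty discussed in the abstract arises.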

