[2510.26792] Learning Pseudorandom Numbers with Transformers: Permuted Congruential Generators, Curricula, and Interpretability
Summary
This article examines how Transformer models learn sequences generated by Permuted Congruential Generators (PCGs), showing that Transformers can predict upcoming outputs in context despite the bit-level scrambling these generators apply to their hidden state.
Why It Matters
Understanding whether Transformers can learn and predict sequences from complex PRNGs like PCGs matters for machine learning, cryptanalysis, and AI interpretability alike. The work shows that Transformers can recover highly structured but deliberately obfuscated patterns, and that curriculum learning plays a central role in making this possible at scale.
Key Takeaways
- Transformers can effectively learn and predict sequences from complex Permuted Congruential Generators (PCGs).
- The study reveals a scaling law indicating that the number of sequence elements needed for accurate predictions grows with the modulus size.
- Curriculum learning becomes essential for successful training as the modulus grows.
- Novel clustering phenomena in embedding layers suggest that representations can transfer across different scales of moduli.
- The findings have implications for improving AI interpretability and enhancing cryptographic applications.
Computer Science > Machine Learning
arXiv:2510.26792 (cs)
[Submitted on 30 Oct 2025 (v1), last revised 16 Feb 2026 (this version, v2)]
Title: Learning Pseudorandom Numbers with Transformers: Permuted Congruential Generators, Curricula, and Interpretability
Authors: Tao Tao, Maissam Barkeshli
Abstract: We study the ability of Transformer models to learn sequences generated by Permuted Congruential Generators (PCGs), a widely used family of pseudo-random number generators (PRNGs). PCGs introduce substantial additional difficulty over linear congruential generators (LCGs) by applying a series of bit-wise shifts, XORs, rotations, and truncations to the hidden state. We show that Transformers can nevertheless successfully perform in-context prediction on unseen sequences from diverse PCG variants, in tasks that are beyond published classical attacks. In our experiments we scale moduli up to $2^{22}$ using up to $50$ million model parameters and datasets with up to $5$ billion tokens. Surprisingly, we find that even when the output is truncated to a single bit, it can be reliably predicted by the model. When multiple distinct PRNGs are presented together during training, the model can jointly learn them, identifying structures from different permutations. We demonstrate a scaling la...
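For readers unfamiliar with the construction, the bit-level operations the abstract describes can be made concrete. Below is a minimal sketch of one well-known member of the family, PCG-XSH-RR 64/32 (a 64-bit LCG state whose output is a permuted, truncated 32-bit value); the paper's exact variants, moduli, and constants may differ from this illustration:

```python
def pcg32_step(state: int,
               mult: int = 6364136223846793005,  # standard PCG 64-bit LCG multiplier
               inc: int = 1442695040888963407):  # an odd increment (stream selector)
    """One step of PCG-XSH-RR 64/32: returns (new_state, 32-bit output)."""
    mask64, mask32 = (1 << 64) - 1, (1 << 32) - 1
    # 1) Ordinary LCG update of the hidden state, modulo 2^64.
    state = (state * mult + inc) & mask64
    # 2) "XSH": xorshift-high folds high-order state bits into the low bits,
    #    then the result is truncated to 32 bits.
    xorshifted = (((state >> 18) ^ state) >> 27) & mask32
    # 3) "RR": a data-dependent right rotation by the top 5 bits of the state.
    rot = state >> 59
    output = ((xorshifted >> rot) | (xorshifted << ((32 - rot) & 31))) & mask32
    return state, output

# Emit a short output sequence from an arbitrary seed.
state = 0x853C49E6748FEA9B
outputs = []
for _ in range(8):
    state, out = pcg32_step(state)
    outputs.append(out)
```

The model's task in the paper is in-context prediction over sequences like `outputs`: given a prefix of emitted values (possibly truncated down to a single bit each), predict the next one without access to the hidden state, seed, or increment.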