[2511.16652] Evolution Strategies at the Hyperscale

arXiv - AI · 4 min read

Summary

The paper presents EGGROLL, an Evolution Strategies method that replaces full-rank random perturbations with low-rank ones, yielding up to a hundredfold training speedup for billion-parameter models along with theoretical insight into ES convergence in high dimensions.

Why It Matters

As machine learning models grow in size, efficient optimization methods become crucial. EGGROLL addresses the poor GPU efficiency (low arithmetic intensity) of traditional Evolution Strategies at scale, offering a scalable approach that maintains performance while dramatically increasing training speed, which is vital for applying ES to frontier-scale models.

Key Takeaways

  • EGGROLL improves the efficiency of Evolution Strategies (ES) for large models; a plain ES update is sketched after this list for context.
  • Achieves up to 91% of the throughput of pure batch inference by using structured low-rank perturbations.
  • Provides theoretical insights into the convergence of ES in high dimensions.
  • Demonstrates competitive performance on reasoning tasks with nonlinear recurrent models.
  • Maintains performance in reinforcement learning settings despite the much faster training.
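
For context, the sketch below shows the standard Gaussian ES update that EGGROLL builds on: perturb the parameters with population noise, score each perturbation, and move along the reward-weighted noise. This is the textbook estimator, not the paper's implementation; the objective and all hyperparameters are illustrative.

```python
# Plain Gaussian ES update (the baseline that EGGROLL accelerates).
# Textbook form; not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

def es_step(theta, fitness, pop=64, sigma=0.1, lr=0.05):
    """One ES ascent step on a flat parameter vector."""
    eps = rng.normal(size=(pop, theta.size))            # unstructured noise
    rewards = np.array([fitness(theta + sigma * e) for e in eps])
    rewards = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    grad = (rewards[:, None] * eps).mean(axis=0) / sigma
    return theta + lr * grad

# Toy check: maximise -||theta - 3||^2, whose optimum is theta = 3.
theta = np.zeros(10)
for _ in range(300):
    theta = es_step(theta, lambda t: -np.sum((t - 3.0) ** 2))
print(theta.round(1))  # each coordinate hovers near 3
```

The expensive part at scale is the population dimension: every member needs its own perturbed forward pass, which is exactly the cost EGGROLL's low-rank structure attacks.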

Computer Science > Machine Learning · arXiv:2511.16652 (cs)
[Submitted on 20 Nov 2025 (v1), last revised 16 Feb 2026 (this version, v2)]

Title: Evolution Strategies at the Hyperscale
Authors: Bidipta Sarkar, Mattie Fellows, Juan Agustin Duque, Alistair Letcher, Antonio León Villares, Anya Sims, Clarisse Wibault, Dmitry Samsonov, Dylan Cope, Jarek Liesen, Kang Li, Lukas Seier, Theo Wolf, Uljad Berdica, Valentin Mohl, Alexander David Goldie, Aaron Courville, Karin Sevegnani, Shimon Whiteson, Jakob Nicolaus Foerster

Abstract: Evolution Strategies (ES) is a class of powerful black-box optimisation methods that are highly parallelisable and can handle non-differentiable and noisy objectives. However, naïve ES becomes prohibitively expensive at scale on GPUs due to the low arithmetic intensity of batched matrix multiplications with unstructured random perturbations. We introduce Evolution Guided GeneRal Optimisation via Low-rank Learning (EGGROLL), which improves arithmetic intensity by structuring individual perturbations as rank-$r$ matrices, resulting in a hundredfold increase in training speed for billion-parameter models at large population sizes, achieving up to 91% of the throughput of pure batch inference. We provide a rigorous theoretical analysis of Gaussian ES for high-dimensional parameter objectives, investigating conditions needed for ES updates to converge...
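The efficiency trick itself is easy to illustrate. In the sketch below, each population member's perturbation of a weight matrix W is a rank-r product A @ B.T, so the heavy W matmul is computed once as a single large GEMM shared across the population, while the per-member correction costs only two thin matmuls. The shapes, the einsum formulation, and the 1/sqrt(r) scaling are my reading of the abstract, not the released implementation.

```python
# Low-rank perturbation idea behind EGGROLL (my reconstruction from the
# abstract; names, shapes, and scaling are illustrative assumptions).
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 1024, 1024, 4   # layer shape, perturbation rank
pop, batch = 8, 32               # population size, batch per member
sigma = 0.01                     # ES noise scale

W = rng.normal(size=(d_in, d_out)) / np.sqrt(d_in)   # shared weights

# Each member i carries E_i = A[i] @ B[i].T, stored as two thin factors
# instead of a full d_in x d_out perturbation matrix.
A = rng.normal(size=(pop, d_in, r))
B = rng.normal(size=(pop, d_out, r))

x = rng.normal(size=(pop, batch, d_in))              # per-member inputs

# Shared part: one big GEMM over the whole population...
shared = (x.reshape(pop * batch, d_in) @ W).reshape(pop, batch, d_out)
# ...plus a cheap rank-r correction per member.
low_rank = np.einsum('pbi,pir,por->pbo', x, A, B)
y = shared + (sigma / np.sqrt(r)) * low_rank

# Matches the naive per-member computation with full perturbation matrices.
y_naive = np.stack([
    x[i] @ (W + (sigma / np.sqrt(r)) * A[i] @ B[i].T) for i in range(pop)
])
assert np.allclose(y, y_naive)
```

Restoring one large GEMM is what recovers the arithmetic intensity the abstract refers to: the naive scheme does a small, memory-bound matmul per member, while here the population-sized batch keeps the GPU busy.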

Related Articles

LLMs

Seeking Critique on Research Approach to Open Set Recognition (Novelty Detection) [R]

Hey guys, I'm an independent researcher working on a project that tries to address a very specific failure mode in LLMs and embedding bas...

Reddit - Machine Learning · 1 min ·
Machine Learning

What if attention didn’t need matrix multiplication?

I built a cognitive architecture where all computation reduces to three bit operations: XOR, MAJ, POPCNT. No GEMM. No GPU. No floating-po...

Reddit - Artificial Intelligence · 1 min ·
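
As a rough illustration of the primitives named in the post above (my toy reading only, not the poster's architecture): XOR plus POPCNT yields a Hamming similarity that can stand in for a dot-product attention score, and MAJ is a bitwise majority vote usable as a mixing operation.

```python
# Toy matmul-free similarity using XOR, MAJ, and POPCNT on uint64 words.
# Illustrative only; not the architecture described in the post.
import numpy as np

def popcount(words):
    """POPCNT: total number of set bits in an array of uint64 words."""
    return int(np.unpackbits(words.view(np.uint8)).sum())

def hamming_similarity(q, k):
    """Number of bit positions where q and k agree (a score in [0, bits])."""
    return 64 * q.size - popcount(np.bitwise_xor(q, k))

def maj3(a, b, c):
    """MAJ: a bit is set where at least two of the three inputs are set."""
    return (a & b) | (a & c) | (b & c)

rng = np.random.default_rng(0)
q, k = rng.integers(0, 2**63, size=(2, 4), dtype=np.uint64)
print(hamming_similarity(q, k))      # ~128 of 256 bits agree at random
print(hex(maj3(q[0], q[1], k[0])))   # bitwise majority of three words
```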
Machine Learning

WTF. It's real. Allbirds (the shoe company) is pivoting to inference.

I'm profoundly ambivalent re: how to feel about this; is it great -- what a scrappy, bold pivot! Or wildly dumb -- it's so far from their c...

Reddit - Artificial Intelligence · 1 min ·
AI Infrastructure

Allbirds Is Pivoting to AI Compute. Sure, Why Not | WIRED

Once a $4 billion apparel juggernaut, Allbirds will rebrand as NewBird AI, a “GPU-as-a-Service” company. Hey, if you can't beat ’em, join...

Wired - AI · 5 min ·