
[2602.20141] Recurrent Structural Policy Gradient for Partially Observable Mean Field Games

arXiv - AI February 24, 2026 3 min read Article

Summary

This paper presents Recurrent Structural Policy Gradient (RSPG), a method for Partially Observable Mean Field Games (MFGs). It reports state-of-the-art performance and order-of-magnitude faster convergence, with applications to macroeconomic models with heterogeneous agents.

Why It Matters

Existing MFG methods are limited: model-free approaches are too high variance, exact methods scale poorly, and Hybrid Structural Methods had not been scaled to partially observable settings. RSPG targets that gap. This matters for modeling complex interactions in large populations, with applications in economics and AI-driven decision-making.

Key Takeaways

RSPG is the first history-aware Hybrid Structural Method (HSM) for MFG settings involving public information.
By leveraging known transition dynamics, it achieves state-of-the-art performance and order-of-magnitude faster convergence.
The paper also introduces MFAX, a JAX-based framework for MFGs.
It targets models with heterogeneous agents and common noise (aggregate shocks).
Enhances the understanding of large population dynamics in AI applications.

Computer Science > Artificial Intelligence

arXiv:2602.20141 (cs) [Submitted on 23 Feb 2026]

Title: Recurrent Structural Policy Gradient for Partially Observable Mean Field Games

Authors: Clarisse Wibault, Johannes Forkel, Sebastian Towers, Tiphaine Wibault, Juan Duque, George Whittle, Andreas Schaab, Yucheng Yang, Chiyuan Wang, Michael Osborne, Benjamin Moll, Jakob Foerster

Abstract: Mean Field Games (MFGs) provide a principled framework for modeling interactions in large population models: at scale, population dynamics become deterministic, with uncertainty entering only through aggregate shocks, or common noise. However, algorithmic progress has been limited since model-free methods are too high variance and exact methods scale poorly. Recent Hybrid Structural Methods (HSMs) use Monte Carlo rollouts for the common noise in combination with exact estimation of the expected return, conditioned on those samples. However, HSMs have not been scaled to Partially Observable settings. We propose Recurrent Structural Policy Gradient (RSPG), the first history-aware HSM for settings involving public information. We also introduce MFAX, our JAX-based framework for MFGs. By leveraging known transition dynamics, RSPG achieves state-of-the-art performance as well as an order-of-magnitude faster convergence and ...
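The hybrid structural idea described in the abstract (sample the common noise by Monte Carlo, then compute the expected return exactly given each sample) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the two-state model, the transition kernel `P`, the `reward` function, and all function names are hypothetical, and none of it is taken from the paper or from MFAX. The sketch only evaluates a fixed, memoryless policy; it does not implement RSPG's recurrent, history-aware policy or any gradient step.

```python
import random

N_STATES, N_ACTIONS = 2, 2
GAMMA = 0.95  # discount factor (illustrative value)

# Hypothetical known transition kernel: P[z][s][a] is a distribution over
# next states, where z is the realised common-noise value (0 or 1).
P = {
    0: [[[0.9, 0.1], [0.5, 0.5]],
        [[0.3, 0.7], [0.1, 0.9]]],
    1: [[[0.8, 0.2], [0.3, 0.7]],
        [[0.2, 0.8], [0.05, 0.95]]],
}


def reward(s, a, mu, z):
    """Toy reward: state 1 pays more when z = 1, minus crowding and effort costs."""
    return s * (1.0 + 0.5 * z) - mu[s] - 0.1 * a


def propagate(mu, policy, z):
    """Exact one-step update of the population distribution, given noise z."""
    nxt = [0.0] * N_STATES
    for s in range(N_STATES):
        for a in range(N_ACTIONS):
            mass = mu[s] * policy[s][a]
            for s2 in range(N_STATES):
                nxt[s2] += mass * P[z][s][a][s2]
    return nxt


def conditional_return(policy, mu0, noise_path):
    """Exact expected discounted return, conditioned on one common-noise path."""
    mu, total, discount = list(mu0), 0.0, 1.0
    for z in noise_path:
        step = sum(mu[s] * policy[s][a] * reward(s, a, mu, z)
                   for s in range(N_STATES) for a in range(N_ACTIONS))
        total += discount * step
        mu = propagate(mu, policy, z)  # deterministic once z is fixed
        discount *= GAMMA
    return total


def hybrid_estimate(policy, mu0, horizon, n_paths, seed=0):
    """Monte Carlo average over sampled noise paths of the exact conditional return."""
    rng = random.Random(seed)
    returns = []
    for _ in range(n_paths):
        path = [rng.randint(0, 1) for _ in range(horizon)]
        returns.append(conditional_return(policy, mu0, path))
    return sum(returns) / n_paths
```

A call such as `hybrid_estimate([[0.5, 0.5], [0.5, 0.5]], [0.5, 0.5], horizon=20, n_paths=100)` averages over 100 sampled noise paths. The only sampling variance comes from the common noise, because the individual-level randomness is integrated out exactly inside `conditional_return`.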

