[2602.20141] Recurrent Structural Policy Gradient for Partially Observable Mean Field Games
Summary
This paper presents Recurrent Structural Policy Gradient (RSPG), a history-aware Hybrid Structural Method for partially observable Mean Field Games (MFGs). It reports state-of-the-art performance and an order-of-magnitude faster convergence in macroeconomic models with heterogeneous agents.
Why It Matters
Existing approaches to MFGs each have a limitation: model-free methods suffer from high variance, exact methods scale poorly, and Hybrid Structural Methods had not been scaled to partially observable settings. RSPG targets that last gap, which matters for modeling interactions in large populations in fields such as economics and AI-driven decision-making.
Key Takeaways
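- RSPG is the first history-aware Hybrid Structural Method (HSM), aimed at partially observable settings involving public information.
- By leveraging known transition dynamics, it achieves state-of-the-art performance and an order-of-magnitude faster convergence.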
- Introduces MFAX, a JAX-based framework for MFGs.
- Addresses challenges in modeling heterogeneous agents and common noise.
- Enhances the understanding of large population dynamics in AI applications.
arXiv:2602.20141 (cs), Computer Science > Artificial Intelligence
Submitted on 23 Feb 2026
Authors: Clarisse Wibault, Johannes Forkel, Sebastian Towers, Tiphaine Wibault, Juan Duque, George Whittle, Andreas Schaab, Yucheng Yang, Chiyuan Wang, Michael Osborne, Benjamin Moll, Jakob Foerster
Abstract: Mean Field Games (MFGs) provide a principled framework for modeling interactions in large population models: at scale, population dynamics become deterministic, with uncertainty entering only through aggregate shocks, or common noise. However, algorithmic progress has been limited since model-free methods are too high variance and exact methods scale poorly. Recent Hybrid Structural Methods (HSMs) use Monte Carlo rollouts for the common noise in combination with exact estimation of the expected return, conditioned on those samples. However, HSMs have not been scaled to Partially Observable settings. We propose Recurrent Structural Policy Gradient (RSPG), the first history-aware HSM for settings involving public information. We also introduce MFAX, our JAX-based framework for MFGs. By leveraging known transition dynamics, RSPG achieves state-of-the-art performance as well as an order-of-magnitude faster convergence and ...
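The hybrid structural idea described in the abstract (sample the common noise, then compute the expected return exactly given that sample) can be illustrated with a small sketch. This is not the paper's algorithm and does not use MFAX: the two-state model, the `transition` and `reward` functions, and every number below are hypothetical choices made only for illustration. The policy is memoryless and is only evaluated here, so the recurrent, history-aware part of RSPG and the gradient step are not shown.

```python
import random

# Toy model: every name and number here is an illustrative assumption,
# not taken from the paper or from MFAX.
N_STATES = 2   # individual states: 0 = "low", 1 = "high"
HORIZON = 5    # number of time steps per rollout


def transition(state, action, shock):
    """Return P(next_state | state, action, shock) as a list over next states."""
    # Action 1 raises the chance of reaching the high state; a good
    # aggregate shock (shock = 1) raises it for everyone.
    p_high = 0.2 + 0.5 * action + 0.2 * shock
    return [1.0 - p_high, p_high]


def reward(state, action, mu, shock):
    """Per-step payoff; it falls as more of the population sits in state 1."""
    return state * (1.0 - mu[1]) - 0.1 * action


def exact_return(policy, shocks, mu0):
    """Expected return conditioned on one sampled common-noise path.

    policy[s] is the probability of taking action 1 in state s.
    The population distribution mu is propagated deterministically,
    so no sampling over individual agents is needed.
    """
    mu, total = list(mu0), 0.0
    for shock in shocks:
        next_mu = [0.0] * N_STATES
        for s in range(N_STATES):
            for a, p_a in ((0, 1.0 - policy[s]), (1, policy[s])):
                weight = mu[s] * p_a
                total += weight * reward(s, a, mu, shock)
                probs = transition(s, a, shock)
                for s2 in range(N_STATES):
                    next_mu[s2] += weight * probs[s2]
        mu = next_mu
    return total


def hybrid_estimate(policy, mu0, n_paths, seed=0):
    """Average the exact conditional return over Monte Carlo shock paths."""
    rng = random.Random(seed)
    returns = []
    for _ in range(n_paths):
        shocks = [rng.randint(0, 1) for _ in range(HORIZON)]
        returns.append(exact_return(policy, shocks, mu0))
    return sum(returns) / n_paths


if __name__ == "__main__":
    print(hybrid_estimate(policy=[0.8, 0.1], mu0=[0.5, 0.5], n_paths=100))
```

In this sketch the only source of sampling variance is the shock path, since the sum over individual states and actions is computed exactly from the known transition probabilities. That is the property the abstract contrasts with fully model-free rollouts.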