[2509.16606] Bayesian Ego-graph Inference for Networked Multi-Agent Reinforcement Learning
Summary
The paper presents BayesG, a decentralized actor framework for networked multi-agent reinforcement learning (MARL) that uses Bayesian ego-graph inference to learn sparse, context-aware interaction structures, improving adaptability in dynamic environments.
Why It Matters
This research addresses the limitations of existing MARL methods that rely on static neighborhoods or centralized infrastructure. By learning interaction structures through decentralized Bayesian inference, it offers a scalable solution for real-world applications, such as traffic control, where adaptability and efficiency are crucial.
Key Takeaways
- BayesG allows agents to sample subgraphs for decision-making, enhancing adaptability.
- The framework outperforms traditional MARL methods in large-scale tasks.
- It utilizes Bayesian variational inference to learn interaction topologies.
- The approach is suitable for decentralized systems with limited communication.
- BayesG demonstrates improved scalability and efficiency in dynamic environments.
Computer Science > Multiagent Systems
arXiv:2509.16606 (cs)
[Submitted on 20 Sep 2025 (v1), last revised 13 Feb 2026 (this version, v4)]
Title: Bayesian Ego-graph Inference for Networked Multi-Agent Reinforcement Learning
Authors: Wei Duan, Jie Lu, Junyu Xuan
Abstract: In networked multi-agent reinforcement learning (Networked-MARL), decentralized agents must act under local observability and constrained communication over fixed physical graphs. Existing methods often assume static neighborhoods, limiting adaptability to dynamic or heterogeneous environments. While centralized frameworks can learn dynamic graphs, their reliance on global state access and centralized infrastructure is impractical in real-world decentralized systems. We propose a stochastic graph-based policy for Networked-MARL, where each agent conditions its decision on a sampled subgraph over its local physical neighborhood. Building on this formulation, we introduce BayesG, a decentralized actor-framework that learns sparse, context-aware interaction structures via Bayesian variational inference. Each agent operates over an ego-graph and samples a latent communication mask to guide message passing and policy computation. The variational distribution is trained end-to-end alongside the policy using an evidence lower bound (ELBO) objective, ...
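The abstract describes each agent sampling a latent binary communication mask over its ego-graph edges, using that mask to gate message passing, and regularizing the variational edge distribution through the KL term of an ELBO. A minimal sketch of those three pieces is shown below; all function names, the mean-aggregation rule, and the independent-Bernoulli edge distribution with a sparse Bernoulli prior are illustrative assumptions, not the paper's exact parameterization.

```python
import numpy as np

def sample_edge_mask(logits, rng):
    # Variational distribution q(mask): independent Bernoulli per ego-graph edge
    # (assumed parameterization; the paper's variational family may differ).
    probs = 1.0 / (1.0 + np.exp(-logits))          # sigmoid of learned edge logits
    mask = (rng.random(logits.shape) < probs).astype(float)
    return mask, probs

def masked_message_passing(ego_feat, neighbor_feats, mask):
    # Aggregate only neighbors whose edge is active in the sampled mask,
    # then combine with the agent's own features (simple mean aggregation).
    active = mask.sum()
    if active == 0:
        return ego_feat
    agg = (mask[:, None] * neighbor_feats).sum(axis=0) / active
    return ego_feat + agg

def bernoulli_kl(q_probs, prior_p=0.1):
    # KL(q || p) for independent Bernoulli edges against a sparse prior;
    # this is the sparsity-encouraging regularizer inside an ELBO objective.
    q = np.clip(q_probs, 1e-6, 1.0 - 1e-6)
    return float(np.sum(q * np.log(q / prior_p)
                        + (1.0 - q) * np.log((1.0 - q) / (1.0 - prior_p))))

rng = np.random.default_rng(0)
logits = np.array([2.0, -1.0, 0.5])                # one logit per neighbor edge
mask, probs = sample_edge_mask(logits, rng)
msg = masked_message_passing(np.ones(4), rng.normal(size=(3, 4)), mask)
kl = bernoulli_kl(probs)                           # non-negative by construction
```

In an end-to-end setup the hard Bernoulli sample would typically be replaced by a relaxed (e.g. Gumbel-sigmoid) sample so the edge logits receive gradients through the policy loss; that relaxation is omitted here for brevity.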