[2602.13309] Adaptive Value Decomposition: Coordinating a Varying Number of Agents in Urban Systems
Summary
The paper presents Adaptive Value Decomposition (AVD), a framework for coordinating multi-agent systems in urban environments where both the number of active agents and the durations of their actions vary over time.
Why It Matters
This research tackles a key limitation of traditional multi-agent reinforcement learning (MARL) methods: the assumption of a fixed number of agents acting synchronously. By adapting to dynamic urban systems, AVD improves coordination and efficiency in real-world applications such as bike-sharing, making it relevant to urban planning and intelligent transportation systems.
Key Takeaways
- AVD adapts to changing agent populations, improving coordination.
- The framework mitigates action homogenization through behavioral diversity.
- It is designed for semi-MARL settings, accommodating asynchronous decision-making.
- Experiments show AVD outperforms existing methods in real-world scenarios.
- The approach has implications for urban systems and intelligent transportation.
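The paper does not include code, but the core idea in the takeaways above, decomposing a joint value over a pool of agents whose size changes over time, can be illustrated with a minimal sketch. The snippet below shows a simple VDN-style additive mixer with an activity mask over padded per-agent utilities; the function name, shapes, and masking scheme are illustrative assumptions, not the paper's actual AVD mixer.

```python
import numpy as np

def joint_value(agent_q, active_mask):
    """Additive value decomposition over a variable agent set.

    agent_q:     (batch, max_agents) per-agent utilities, zero-padded.
    active_mask: (batch, max_agents) 1.0 for active agents, 0.0 for padding.
    Returns:     (batch,) joint values computed from active agents only.
    """
    agent_q = np.asarray(agent_q, dtype=float)
    active_mask = np.asarray(active_mask, dtype=float)
    # Padding slots are zeroed out, so episodes with different numbers
    # of active agents share one batched computation.
    return (agent_q * active_mask).sum(axis=1)

# Two decision points with different numbers of active agents.
q = [[1.0, 2.0, 3.0],
     [0.5, 0.5, 0.0]]
mask = [[1, 1, 1],
        [1, 1, 0]]  # second row: only two agents are active
print(joint_value(q, mask))  # [6. 1.]
```

In practice, AVD's mixing would be learned rather than a plain sum, but the masking pattern shown here is one standard way to let a single batched network handle a changing agent population.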
Computer Science > Multiagent Systems
arXiv:2602.13309 (cs) [Submitted on 10 Feb 2026]
Title: Adaptive Value Decomposition: Coordinating a Varying Number of Agents in Urban Systems
Authors: Yexin Li, Jinjin Guo, Haoyu Zhang, Yuhan Zhao, Yiwen Sun, Zihao Jiao
Abstract: Multi-agent reinforcement learning (MARL) provides a promising paradigm for coordinating multi-agent systems (MAS). However, most existing methods rely on restrictive assumptions, such as a fixed number of agents and fully synchronous action execution. These assumptions are often violated in urban systems, where the number of active agents varies over time and actions may have heterogeneous durations, resulting in a semi-MARL setting. Moreover, while sharing policy parameters among agents is commonly adopted to improve learning efficiency, it can lead to highly homogeneous actions when a subset of agents makes decisions concurrently under similar observations, potentially degrading coordination quality. To address these challenges, we propose Adaptive Value Decomposition (AVD), a cooperative MARL framework that adapts to a dynamically changing agent population. AVD further incorporates a lightweight mechanism to mitigate action homogenization induced by shared policies, thereby enc...