[2602.13255] DPBench: Large Language Models Struggle with Simultaneous Coordination
Summary
The paper introduces DPBench, a benchmark assessing how well large language models (LLMs) coordinate in multi-agent systems, revealing significant challenges in simultaneous decision-making.
Why It Matters
As LLMs are increasingly used in multi-agent environments, understanding their coordination capabilities is crucial. The findings highlight the limitations of current models in handling simultaneous tasks, suggesting that reliable multi-agent deployments may require explicit coordination mechanisms rather than relying on the models' emergent behavior.
Key Takeaways
- DPBench evaluates LLM coordination using the Dining Philosophers problem.
- LLMs perform well in sequential decision-making but struggle with simultaneous coordination, leading to high deadlock rates.
- Communication among agents does not necessarily improve coordination and may even increase deadlock rates.
- The study indicates a need for external coordination mechanisms in multi-agent LLM systems.
- DPBench is released as an open-source tool for further research.
Computer Science > Artificial Intelligence
arXiv:2602.13255 (cs) [Submitted on 2 Feb 2026]
Title: DPBench: Large Language Models Struggle with Simultaneous Coordination
Authors: Najmul Hasan, Prashanth BusiReddyGari
Abstract: Large language models are increasingly deployed in multi-agent systems, yet we lack benchmarks that test whether they can coordinate under resource contention. We introduce DPBench, a benchmark based on the Dining Philosophers problem that evaluates LLM coordination across eight conditions that vary decision timing, group size, and communication. Our experiments with GPT-5.2, Claude Opus 4.5, and Grok 4.1 reveal a striking asymmetry: LLMs coordinate effectively in sequential settings but fail when decisions must be made simultaneously, with deadlock rates exceeding 95% under some conditions. We trace this failure to convergent reasoning, where agents independently arrive at identical strategies that, when executed simultaneously, guarantee deadlock. Contrary to expectations, enabling communication does not resolve this problem and can even increase deadlock rates. Our findings suggest that multi-agent LLM systems requiring concurrent resource access may need external coordination mechanisms rather than relying on emergent coordination. DPBench is released as an open-source benchmark.
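The convergent-reasoning failure the paper describes maps directly onto the classic Dining Philosophers setup: if every agent independently adopts the identical "grab the left fork first" strategy and all act simultaneously, deadlock is guaranteed, whereas a symmetry-breaking rule such as a global fork ordering avoids it. The round-based simulation below is a minimal sketch of that dynamic, not the DPBench harness itself; the function names and the lowest-id tie-breaking rule are illustrative assumptions.

```python
def simulate(n, order_for, max_rounds=None):
    """Dining Philosophers with simultaneous moves, resolved in rounds.

    order_for(i) -> the two forks philosopher i acquires, in order.
    Each round, every hungry philosopher requests the next fork in its
    order; free forks are granted, with ties broken by lowest id (an
    arbitrary deterministic choice). A philosopher holding both forks
    eats once and releases them. Returns ("deadlock"|"done", number fed).
    """
    max_rounds = max_rounds or 6 * n
    held = {}                              # fork index -> holder
    has = {i: [] for i in range(n)}        # philosopher -> forks held
    ate = set()
    for _ in range(max_rounds):
        if len(ate) == n:
            break
        progressed = False
        requests = {}
        for i in range(n):
            if i in ate:
                continue
            want = [f for f in order_for(i) if f not in has[i]]
            if not want:                   # holds both forks: eat, release
                ate.add(i)
                for f in has[i]:
                    del held[f]
                has[i] = []
                progressed = True
            elif want[0] not in held:      # next fork is free: request it
                requests.setdefault(want[0], []).append(i)
        for f, claimants in requests.items():
            winner = min(claimants)        # deterministic tie-break
            held[f] = winner
            has[winner].append(f)
            progressed = True
        if not progressed:                 # no grants, no meals: stuck
            return "deadlock", len(ate)
    return "done", len(ate)

# Symmetric strategy (everyone grabs the left fork first): all five
# philosophers end up holding one fork and waiting forever.
print(simulate(5, lambda i: (i, (i + 1) % 5)))            # ('deadlock', 0)

# Symmetry broken by a global fork ordering (always take the
# lower-numbered fork first): everyone eventually eats.
print(simulate(5, lambda i: tuple(sorted((i, (i + 1) % 5)))))  # ('done', 5)
```

The second run illustrates the paper's conclusion from the classical-concurrency side: it is not the strategy itself that fails but its simultaneous, symmetric adoption, and an externally imposed asymmetry (here, a total order on resources) is what restores progress.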