[2603.20324] When Agents Disagree: The Selection Bottleneck in Multi-Agent LLM Pipelines
Computer Science > Multiagent Systems
arXiv:2603.20324 (cs)
[Submitted on 20 Mar 2026]
Authors: Artem Maryanskyy

Abstract: Multi-agent LLM pipelines produce contradictory evidence on whether team diversity improves output quality: heterogeneous Mixture-of-Agents teams outperform single models, yet homogeneous Self-MoA teams consistently win under synthesis-based aggregation. We propose a resolution by identifying the selection bottleneck, a crossover threshold in aggregation quality that determines whether diversity helps or hurts. Under this model, we obtain a closed-form crossover threshold $s^*$ (Proposition 1) that separates the regimes where diversity helps and hurts. In a targeted experiment spanning 42 tasks across 7 categories ($N=210$), a diverse team with judge-based selection achieves a win rate of 0.810 against a single-model baseline, while a homogeneous team scores 0.512, near chance (Glass's $\Delta = 2.07$). Judge-based selection outperforms MoA-style synthesis by $\Delta_{\mathrm{WR}} = +0.631$; the synthesis approach is preferred over the baseline in zero of 42 tasks by the judge panel. A decoupled evaluation with independent judges confirms all directional findings (Spearman $\rho = 0.90$). Exploratory evidence sugg...
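
The abstract reports its effect size as Glass's $\Delta$, which scales the difference in group means by the standard deviation of one reference group only. The sketch below shows how such a value is typically computed. It assumes the homogeneous team is the reference group and uses the sample standard deviation; the abstract confirms neither choice. The function name `glass_delta` and the per-task scores are hypothetical illustrations, not the paper's code or data.

```python
from statistics import mean, stdev


def glass_delta(treatment, control):
    """Glass's delta: mean difference divided by the control group's sample SD.

    Only the control group's spread enters the denominator, unlike a
    pooled-SD effect size.
    """
    sd_control = stdev(control)  # sample standard deviation (n - 1 denominator)
    if sd_control == 0:
        raise ValueError("control group has zero variance; delta is undefined")
    return (mean(treatment) - mean(control)) / sd_control


# Hypothetical per-task win rates, NOT taken from the paper.
diverse_scores = [0.9, 0.8, 0.7, 0.8]
homogeneous_scores = [0.5, 0.6, 0.4, 0.5]

# Assumption: the homogeneous team serves as the control group.
effect = glass_delta(diverse_scores, homogeneous_scores)
print(f"Glass's delta on the toy data: {effect:.2f}")
```

Scaling by the reference group alone is a common choice when the two conditions may have very different variances, which is plausible when one team's win rate sits near chance.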