[2603.24617] Multi-LLM Query Optimization
About this article
Abstract page for arXiv paper 2603.24617: Multi-LLM Query Optimization
Computer Science > Data Structures and Algorithms arXiv:2603.24617 (cs) [Submitted on 24 Mar 2026] Title:Multi-LLM Query Optimization Authors:Arlen Dean, Zijin Zhang, Stefanus Jasin, Yuqing Liu View a PDF of the paper titled Multi-LLM Query Optimization, by Arlen Dean and 3 other authors View PDF HTML (experimental) Abstract:Deploying multiple large language models (LLMs) in parallel to classify an unknown ground-truth label is a common practice, yet the problem of optimally allocating queries across heterogeneous models remains poorly understood. In this paper, we formulate a robust, offline query-planning problem that minimizes total query cost subject to statewise error constraints which guarantee reliability for every possible ground-truth label. We first establish that this problem is NP-hard via a reduction from the minimum-weight set cover problem. To overcome this intractability, we develop a surrogate by combining a union bound decomposition of the multi-class error into pairwise comparisons with Chernoff-type concentration bounds. The resulting surrogate admits a closed-form, multiplicatively separable expression in the query counts and is guaranteed to be feasibility-preserving. We further show that the surrogate is asymptotically tight at the optimization level: the ratio of surrogate-optimal cost to true optimal cost converges to one as error tolerances shrink, with an explicit rate of $O\left(\log\log(1/\alpha_{\min}) / \log(1/\alpha_{\min})\right)$. Finally,...