[2603.04751] Evaluating the Search Agent in a Parallel World
Computer Science > Artificial Intelligence
arXiv:2603.04751 (cs)
[Submitted on 5 Mar 2026]

Title: Evaluating the Search Agent in a Parallel World

Authors: Jiawei Chen, Xintian Shen, Lihao Zheng, Lifu Mu, Haoyi Sun, Ning Mao, Hao Ma, Tao Wei, Pan Zhou, Kun Zhan

Abstract: Integrating web search tools has significantly extended the capability of LLMs to address open-world, real-time, and long-tail problems. However, evaluating these Search Agents presents formidable challenges. First, constructing high-quality deep search benchmarks is prohibitively expensive, while unverified synthetic data often suffers from unreliable sources. Second, static benchmarks face dynamic obsolescence: as internet information evolves, complex queries requiring deep research often degrade into simple retrieval tasks due to increased popularity, and ground truths become outdated due to temporal shifts. Third, attribution ambiguity confounds evaluation, as an agent's performance is often dominated by its parametric memory rather than its actual search and reasoning capabilities. Finally, reliance on specific commercial search engines introduces variability that hampers reproducibility. To address these issues, we propose a novel framework, Mind-ParaWorld, for evaluating Search Agents in a Parallel World. Specifically, MPW samples real-world entity names to synt...