[2603.02153] Scaling Retrieval Augmented Generation with RAG Fusion: Lessons from an Industry Deployment

arXiv - AI March 03, 2026 4 min read

About this article

Abstract page for arXiv paper 2603.02153: Scaling Retrieval Augmented Generation with RAG Fusion: Lessons from an Industry Deployment

Computer Science > Information Retrieval

arXiv:2603.02153 (cs)

[Submitted on 2 Mar 2026]

Title: Scaling Retrieval Augmented Generation with RAG Fusion: Lessons from an Industry Deployment

Authors: Luigi Medrano, Arush Verma, Mukul Chhabra

Abstract: Retrieval-Augmented Generation (RAG) systems commonly adopt retrieval fusion techniques such as multi-query retrieval and reciprocal rank fusion (RRF) to increase document recall, under the assumption that higher recall leads to better answer quality. While these methods show consistent gains in isolated retrieval benchmarks, their effectiveness under realistic production constraints remains underexplored. In this work, we evaluate retrieval fusion in a production-style RAG pipeline operating over an enterprise knowledge base, with fixed retrieval depth, re-ranking budgets, and latency constraints. Across multiple fusion configurations, we find that retrieval fusion does increase raw recall, but these gains are largely neutralized after re-ranking and truncation. In our setting, fusion variants fail to outperform single-query baselines on KB-level Top-$k$ accuracy, with Hit@10 decreasing from $0.51$ to $0.48$ in several configurations. Moreover, fusion introduces additional latency overhead due to query rewriting and larger candidate sets, witho...
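To make the abstract's terms concrete, here is a minimal Python sketch of reciprocal rank fusion over several ranked lists, followed by a Hit@k check after truncation. It is an illustration under stated assumptions, not the authors' pipeline: the document ids, the ranked lists, the relevant set, and the helper names reciprocal_rank_fusion and hit_at_k are all hypothetical, and the smoothing constant k=60 is the value commonly used for RRF, not one reported in the abstract.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids; each list adds 1 / (k + rank) to a doc's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by doc id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def hit_at_k(ranked, relevant, k):
    """Return 1.0 if any relevant doc id appears in the top k of the ranking, else 0.0."""
    return 1.0 if any(d in relevant for d in ranked[:k]) else 0.0


# Hypothetical toy data: one original query plus two rewritten queries.
original = ["d1", "d2", "d3", "d4"]
rewrite_a = ["d5", "d1", "d6", "d2"]
rewrite_b = ["d7", "d5", "d1", "d8"]
relevant = {"d3"}  # assumed ground truth for this toy query

fused = reciprocal_rank_fusion([original, rewrite_a, rewrite_b])

# The fused pool holds more candidates than the single-query list (higher raw recall),
# but a fixed truncation depth can still push the relevant document out of the top k.
print(len(original), len(fused))
print(hit_at_k(original, relevant, 3), hit_at_k(fused, relevant, 3))
```

In this toy case the fused list contains every candidate, including d3, yet documents that several rewrites agree on outrank it, so it falls below a cutoff of 3. That is one way recall gains could vanish after truncation; the abstract does not say this is the mechanism at work in the authors' deployment.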

Originally published on March 03, 2026. Curated by AI News.

Related Articles

[2603.17839] How do LLMs Compute Verbal Confidence

[2602.03584] $V_0$: A Generalist Value Model for Any Policy at State Zero

[2601.04448] Merging Triggers, Breaking Backdoors: Defensive Poisoning for Instruction-Tuned Language Models

[2512.05411] A Systematic Framework for Enterprise Knowledge Retrieval: Leveraging LLM-Generated Metadata to Enhance RAG Systems
