[2603.08819] Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage

Computer Science > Information Retrieval

arXiv:2603.08819 (cs)

[Submitted on 9 Mar 2026 (v1), last revised 14 Apr 2026 (this version, v3)]

Title: Beyond Relevance: On the Relationship Between Retrieval and RAG Information Coverage

Authors: Saron Samuel, Alexander Martin, Eugene Yang, Andrew Yates, Dawn Lawrie, Laura Dietz, Benjamin Van Durme

Abstract: Retrieval-augmented generation (RAG) systems combine document retrieval with a generative model to address complex information seeking tasks like report generation. While the relationship between retrieval quality and generation effectiveness seems intuitive, it has not been systematically studied. We investigate whether upstream retrieval metrics can serve as reliable early indicators of the final generated response's information coverage. Through experiments across two text RAG benchmarks (TREC NeuCLIR 2024 and TREC RAG 2024) and one multimodal benchmark (WikiVideo), we analyze 15 text retrieval stacks and 10 multimodal retrieval stacks across four RAG pipelines and multiple evaluation frameworks (Auto-ARGUE and MiRAGE). Our findings demonstrate strong correlations between coverage-based retrieval metrics and nugget coverage in generated responses at both topic and system levels. This relationship holds most strongly when retrieval objectives align with generation goa...
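The kind of analysis the abstract describes, relating a coverage-based retrieval score to nugget coverage in the generated response, can be illustrated with a small sketch. Everything here is an assumption for illustration: the topic names, the nugget sets, the helper names `nugget_coverage` and `pearson`, and the choice of Pearson correlation, since the excerpt does not state which correlation statistic or which exact metric definitions the paper uses.

```python
from math import sqrt


def nugget_coverage(covered, all_nuggets):
    """Fraction of a topic's nuggets that appear in `covered` (hypothetical helper)."""
    all_nuggets = set(all_nuggets)
    if not all_nuggets:
        return 0.0
    return len(all_nuggets & set(covered)) / len(all_nuggets)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (illustrative choice of statistic)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up toy data: per topic, the gold nuggets, the nuggets found in the
# retrieved documents, and the nuggets found in the generated response.
topics = {
    "t1": {"gold": {"a", "b", "c", "d"}, "retrieved": {"a", "b", "c"}, "generated": {"a", "b"}},
    "t2": {"gold": {"a", "b"}, "retrieved": {"a"}, "generated": {"a"}},
    "t3": {"gold": {"a", "b", "c", "d"}, "retrieved": {"a"}, "generated": set()},
}

# Topic-level scores for one hypothetical retrieval stack and pipeline.
retrieval_scores = [nugget_coverage(t["retrieved"], t["gold"]) for t in topics.values()]
response_scores = [nugget_coverage(t["generated"], t["gold"]) for t in topics.values()]

topic_level_r = pearson(retrieval_scores, response_scores)
print(f"topic-level correlation: {topic_level_r:.3f}")
```

A system-level version of the same sketch would first average each score over topics for every retrieval stack, then correlate those per-system averages.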

Originally published on April 15, 2026. Curated by AI News.
