[2604.07274] A Systematic Study of Retrieval Pipeline Design for Retrieval-Augmented Medical Question Answering
Computer Science > Computation and Language
arXiv:2604.07274 (cs)
[Submitted on 8 Apr 2026]

Title: A Systematic Study of Retrieval Pipeline Design for Retrieval-Augmented Medical Question Answering

Authors: Nusrat Sultana, Abdullah Muhammad Moosa, Kazi Afzalur Rahman, Sajal Chandra Banik

Abstract: Large language models (LLMs) have demonstrated strong capabilities in medical question answering; however, purely parametric models often suffer from knowledge gaps and limited factual grounding. Retrieval-augmented generation (RAG) addresses this limitation by integrating external knowledge retrieval into the reasoning process. Despite increasing interest in RAG-based medical systems, the impact of individual retrieval components on performance remains insufficiently understood. This study presents a systematic evaluation of retrieval-augmented medical question answering using the MedQA USMLE benchmark and a structured textbook-based knowledge corpus. We analyze the interaction between language models, embedding models, retrieval strategies, query reformulation, and cross-encoder reranking within a unified experimental framework comprising forty configurations. Results show that retrieval augmentation significantly improves zero-shot medical question answering performance. The best-performing configuration ...