[2602.22216] Retrieval-Augmented Generation Assistant for Anatomical Pathology Laboratories
Summary
This article presents a Retrieval-Augmented Generation (RAG) assistant built for Anatomical Pathology laboratories. It is intended to give technicians faster, context-grounded access to laboratory protocols and to improve workflow efficiency.
Why It Matters
The study addresses the need for accurate and efficient access to laboratory protocols in Anatomical Pathology, where up to 70% of medical decisions depend on laboratory diagnoses. Static manuals and PDFs are often outdated, fragmented, and hard to search; the proposed RAG assistant is a potential way to reduce the resulting workflow errors and diagnostic delays and to improve patient safety.
Key Takeaways
- RAG assistants can transform static documentation into dynamic knowledge sources.
- Domain-specific embeddings significantly enhance answer relevance and accuracy.
- The study highlights the importance of tailored retrieval methods for healthcare applications.
Computer Science > Information Retrieval
arXiv:2602.22216 (cs)
[Submitted on 8 Dec 2025]
Title: Retrieval-Augmented Generation Assistant for Anatomical Pathology Laboratories
Authors: Diogo Pires, Yuriy Perezhohin, Mauro Castelli
Abstract: Accurate and efficient access to laboratory protocols is essential in Anatomical Pathology (AP), where up to 70% of medical decisions depend on laboratory diagnoses. However, static documentation such as printed manuals or PDFs is often outdated, fragmented, and difficult to search, creating risks of workflow errors and diagnostic delays. This study proposes and evaluates a Retrieval-Augmented Generation (RAG) assistant tailored to AP laboratories, designed to provide technicians with context-grounded answers to protocol-related queries. We curated a novel corpus of 99 AP protocols from a Portuguese healthcare institution and constructed 323 question-answer pairs for systematic evaluation. Ten experiments were conducted, varying chunking strategies, retrieval methods, and embedding models. Performance was assessed using the RAGAS framework (faithfulness, answer relevance, context recall) alongside top-k retrieval metrics. Results show that recursive chunking and hybrid retrieval delivered the strongest baseline performance. Incorporating a biomedical-specific embedding...
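The abstract names hybrid retrieval and top-k retrieval metrics without saying how the lexical and dense results are combined. As a minimal sketch, assuming reciprocal rank fusion (a common fusion rule, not confirmed as the authors' method), the Python below merges two ranked lists of chunk IDs and scores the result with recall@k. The chunk IDs, the function names, and the constant `k=60` are all illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into a single ranking.

    Each list contributes 1 / (k + rank) to a chunk's score, with rank
    starting at 1. k=60 is a conventional default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant chunk IDs that appear in the top-k retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


# Hypothetical rankings from a lexical retriever and a dense retriever.
lexical_ranking = ["p3", "p1", "p2"]
dense_ranking = ["p1", "p4", "p3"]

fused = reciprocal_rank_fusion([lexical_ranking, dense_ranking])
print(fused)                                 # ['p1', 'p3', 'p4', 'p2']
print(recall_at_k(fused, {"p1", "p4"}, 2))   # 0.5
```

Rank-based fusion sidesteps the problem that lexical and dense scores live on different scales, which is one reason it is a common starting point. A weighted sum of normalized scores would be an equally plausible reading of "hybrid retrieval".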