[2605.07517] LARAG: Link-Aware Retrieval Strategy for RAG Systems in Hyperlinked Technical Documentation
Computer Science > Information Retrieval
arXiv:2605.07517 (cs)
[Submitted on 8 May 2026]

Title: LARAG: Link-Aware Retrieval Strategy for RAG Systems in Hyperlinked Technical Documentation

Authors: Giorgia Bolognesi, Claudio Estatico, Ulderico Fugacci, Isabella Mastroianni, Claudio Muselli, Luca Oneto

Abstract: Retrieval-Augmented Generation (RAG) enhances the factual grounding of Large Language Models by conditioning their outputs on external documents. However, standard embedding-based retrievers treat naturally structured corpora, such as technical manuals, as flat collections of passages, thereby overlooking the hyperlink topology that users rely on when navigating such content. We introduce LARAG (Link-Aware RAG): a lightweight, link-aware retrieval strategy that leverages the author-defined hyperlink structure already present in HTML documentation, encoding hyperlink relations as metadata in the chunk representations and exploiting them to perform a form of graph-like retrieval of locally relevant content. In a benchmark of twenty expert-designed queries over Rulex Platform technical documentation and four prompting strategies, LARAG consistently improves answer quality, achieving the highest BERTScore F1, while retrieving fewer chunks and generating fewer tokens than a baseline R...
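The abstract describes storing hyperlink relations as chunk metadata and using them for a graph-like retrieval step. The following Python sketch illustrates that general idea only; it is not the authors' implementation. The `Chunk` class, the `link_aware_retrieve` function, the `top_k` and `link_threshold` parameters, the one-hop expansion rule, and the sample `DOCS` corpus are all hypothetical, and the bag-of-words `embed` function is a toy stand-in for a real embedding model.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Chunk:
    chunk_id: str
    text: str
    # Hypothetical metadata: ids of the chunks this chunk hyperlinks to.
    links: list[str] = field(default_factory=list)


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def link_aware_retrieve(query, chunks, top_k=1, link_threshold=0.1):
    """Return chunk ids: flat top-k seeds, then one-hop linked neighbours.

    A linked neighbour is kept only if its own similarity to the query
    reaches link_threshold (an assumed filtering rule).
    """
    by_id = {c.chunk_id: c for c in chunks}
    q = embed(query)
    scores = {c.chunk_id: cosine(q, embed(c.text)) for c in chunks}

    # Step 1: ordinary flat similarity search picks the seed chunks.
    seeds = sorted(scores, key=scores.get, reverse=True)[:top_k]

    # Step 2: follow the hyperlink metadata of each seed (one hop).
    selected = list(seeds)
    for seed in seeds:
        for target in by_id[seed].links:
            if (
                target in by_id
                and target not in selected
                and scores[target] >= link_threshold
            ):
                selected.append(target)
    return selected


# Hypothetical miniature corpus, not taken from the paper.
DOCS = [
    Chunk("install", "How to install the platform on a server",
          links=["requirements", "license"]),
    Chunk("requirements", "Hardware requirements for the server installation"),
    Chunk("license", "Legal terms and conditions"),
    Chunk("export", "Export a dataset to a file"),
]

if __name__ == "__main__":
    print(link_aware_retrieve("install platform server requirements", DOCS))
```

In this toy corpus the seed chunk links to two pages, but only the one that is itself similar to the query is pulled in. That mirrors, in a simplified way, the abstract's point that link structure can surface locally relevant content without retrieving more chunks overall.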