[2408.09743] R2GenCSR: Mining Contextual and Residual Information for LLMs-based Radiology Report Generation
Computer Science > Computer Vision and Pattern Recognition
arXiv:2408.09743 (cs)
[Submitted on 19 Aug 2024 (v1), last revised 27 Feb 2026 (this version, v2)]

Title: R2GenCSR: Mining Contextual and Residual Information for LLMs-based Radiology Report Generation
Authors: Xiao Wang, Yuehang Li, Fuling Wang, Shiao Wang, Chuanfu Li, Bo Jiang

Abstract: Inspired by the tremendous success of Large Language Models (LLMs), existing radiology report generation methods attempt to leverage large models to achieve better performance. They usually adopt a Transformer to extract visual features from a given X-ray image and then feed those features into the LLM for text generation. How to extract more effective information for the LLM, so that it can improve the final results, remains an open problem; in addition, the use of visual Transformer models incurs high computational complexity. To address these issues, this paper proposes a novel context-guided efficient radiology report generation framework. Specifically, we introduce Mamba as the vision backbone with linear complexity, and the performance it obtains is comparable to that of a strong Transformer model. More importantly, we perform context retrieval from the training set for samples within each mini-batch during the training pha...
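The per-mini-batch context retrieval described in the abstract can be illustrated with a minimal sketch: for each sample in the batch, find the most similar training samples by cosine similarity over visual features and hand their reports to the LLM as context. This is a hypothetical illustration under assumed names (`retrieve_context`, `train_reports`), not the authors' implementation.

```python
import numpy as np

def retrieve_context(batch_feats, train_feats, train_reports, k=2):
    """For each mini-batch sample, return the reports of the k training
    samples with the highest cosine similarity of visual features.
    Hypothetical sketch of context retrieval, not the paper's code."""
    # L2-normalize so a dot product equals cosine similarity
    b = batch_feats / np.linalg.norm(batch_feats, axis=1, keepdims=True)
    t = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    sims = b @ t.T                              # shape: (batch, train)
    top_idx = np.argsort(-sims, axis=1)[:, :k]  # k nearest per sample
    return [[train_reports[j] for j in row] for row in top_idx]

# Toy example: 2 batch samples, 3 training samples, 4-dim features
train_feats = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 0]], float)
train_reports = ["report A", "report B", "report C"]
batch_feats = np.array([[0.9, 0.1, 0, 0], [0.1, 0.9, 0, 0]], float)

ctx = retrieve_context(batch_feats, train_feats, train_reports, k=1)
# → [["report A"], ["report B"]]
```

In practice the training-set features would be precomputed by the vision backbone (here, Mamba) and the retrieved reports prepended to the LLM prompt; the retrieval itself is just a nearest-neighbor lookup.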