[2602.21142] LUMEN: Longitudinal Multi-Modal Radiology Model for Prognosis and Diagnosis
Summary
LUMEN is a training framework for vision-language models that is optimized for longitudinal chest X-ray (CXR) interpretation. It uses multi-image and multi-task instruction fine-tuning to improve both diagnostic and prognostic performance.
Why It Matters
Radiologists rely on temporal changes across prior and current images for accurate diagnosis and prognosis, but comparing studies by hand is time-consuming. LUMEN targets that step: it lets a vision-language model answer questions over a sequence of chest X-rays through a visual question-answering (VQA) interface, with the aim of supporting radiologists' decisions rather than replacing the manual review.
Key Takeaways
- LUMEN is a training framework optimized for longitudinal chest X-ray (CXR) interpretation with vision-language models.
- The model improves diagnostic and prognostic performance through multi-task instruction fine-tuning.
- Experiments show significant advancements over baseline models in visual question-answering tasks.
- The development of a novel instruction-following dataset enhances the model's capabilities.
- LUMEN demonstrates the potential of AI in providing clinically meaningful insights from radiological data.
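To make the "multi-image, multi-task instruction fine-tuning" idea concrete, here is a minimal sketch of how one longitudinal training record could be assembled. This is not the authors' data format: the `<image>` placeholder token, the `LongitudinalSample` class, the field names, the file paths, and the example question and answer are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical placeholder that a VLM tokenizer would swap for image features.
IMAGE_TOKEN = "<image>"


@dataclass
class LongitudinalSample:
    """One instruction-tuning example over a time-ordered series of CXRs."""
    image_paths: List[str]  # ordered oldest to newest (assumed convention)
    task: str               # e.g. "diagnosis" or "prognosis" (assumed labels)
    question: str
    answer: str


def build_prompt(sample: LongitudinalSample) -> str:
    """Interleave one image placeholder per study, then the task and question."""
    if not sample.image_paths:
        raise ValueError("a longitudinal sample needs at least one image")
    lines = []
    last = len(sample.image_paths) - 1
    for i in range(len(sample.image_paths)):
        label = "current study" if i == last else f"prior study {i + 1}"
        lines.append(f"{label}: {IMAGE_TOKEN}")
    lines.append(f"Task: {sample.task}")
    lines.append(f"Question: {sample.question}")
    return "\n".join(lines)


def to_training_record(sample: LongitudinalSample) -> Dict[str, object]:
    """Package images, prompt, and target response for supervised fine-tuning."""
    return {
        "images": list(sample.image_paths),
        "prompt": build_prompt(sample),
        "response": sample.answer,
    }


# Made-up example values, not taken from any dataset.
example = LongitudinalSample(
    image_paths=["visit_1.png", "visit_2.png"],
    task="prognosis",
    question="Has the opacity changed compared with the prior study?",
    answer="Illustrative answer text.",
)
record = to_training_record(example)
```

Mixing records with different `task` values in one training set is what makes the setup multi-task, while the ordered `images` list is what makes it multi-image; how LUMEN actually encodes either is described in the paper, not here.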
arXiv:2602.21142 (cs), Computer Science > Computer Vision and Pattern Recognition
Submitted on 24 Feb 2026
Authors: Zhifan Jiang, Dong Yang, Vishwesh Nath, Abhijeet Parida, Nishad P. Kulkarni, Ziyue Xu, Daguang Xu, Syed Muhammad Anwar, Holger R. Roth, Marius George Linguraru
Abstract: Large vision-language models (VLMs) have evolved from general-purpose applications to specialized use cases such as in the clinical domain, demonstrating potential for decision support in radiology. One promising application is assisting radiologists in decision-making by the analysis of radiology imaging data such as chest X-rays (CXR) via a visual and natural language question-answering (VQA) interface. When longitudinal imaging is available, radiologists analyze temporal changes, which are essential for accurate diagnosis and prognosis. The manual longitudinal analysis is a time-consuming process, motivating the development of a training framework that can provide prognostic capabilities. We introduce a novel training framework LUMEN, that is optimized for longitudinal CXR interpretation, leveraging multi-image and multi-task instruction fine-tuning to enhance prognostic and diagnostic performance. We conduct experiments on the publicly ava...