[2604.08559] Medical Reasoning with Large Language Models: A Survey and MR-Bench
Computer Science > Computation and Language
arXiv:2604.08559 (cs)
[Submitted on 17 Mar 2026]

Title: Medical Reasoning with Large Language Models: A Survey and MR-Bench
Authors: Xiaohan Ren, Chenxiao Fan, Wenyin Ma, Hongliang He, Chongming Gao, Xiaoyan Zhao, Fuli Feng

Abstract: Large language models (LLMs) have achieved strong performance on medical exam-style tasks, motivating growing interest in their deployment in real-world clinical settings. However, clinical decision-making is inherently safety-critical, context-dependent, and conducted under evolving evidence. In such situations, reliable LLM performance depends not on factual recall alone, but on robust medical reasoning. In this work, we present a comprehensive review of medical reasoning with LLMs. Grounded in cognitive theories of clinical reasoning, we conceptualize medical reasoning as an iterative process of abduction, deduction, and induction, and organize existing methods into seven major technical routes spanning training-based and training-free approaches. We further conduct a unified cross-benchmark evaluation of representative medical reasoning models under a consistent experimental setting, enabling a more systematic and comparable assessment of the empirical impact of existing methods. To better assess clinically grounded reasoning, we introduc...