[2512.13742] DL$^3$M: A Vision-to-Language Framework for Expert-Level Medical Reasoning through Deep Learning and Large Language Models

Summary

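The DL$^3$M framework links a deep learning image classifier with large language models so that a prediction on an endoscopic image can be followed by structured clinical reasoning. It targets a gap in current clinical AI: image classifiers do not explain their decisions, and LLMs struggle with visual reasoning and often produce unstable or incorrect explanations.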

Why It Matters

This research highlights the potential of combining deep learning with language models to improve clinical reasoning in medical diagnostics. It underscores the importance of reliable AI in high-stakes medical environments, where accurate explanations are crucial for decision-making.

Key Takeaways

  • DL$^3$M links image classification with structured clinical reasoning.
  • MobileCoAtNet achieves high accuracy in classifying gastrointestinal diseases.
  • Current LLMs struggle with stability and reliability in medical reasoning.
  • Expert-verified benchmarks were created to evaluate LLM reasoning.
  • The framework provides insights into the limitations of AI in medical contexts.
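The first takeaway, linking image classification to structured reasoning, can be pictured with a minimal sketch. This is not the authors' implementation: the class labels, the prompt wording, and the helper names `softmax`, `top_prediction`, and `build_reasoning_prompt` are all hypothetical, and the classifier is stood in for by a plain list of logits. The sketch only shows one plausible way a classifier's output could be turned into a text prompt for a language model.

```python
import math

# Hypothetical placeholder labels: the source does not name the eight classes.
CLASS_LABELS = [f"class_{i}" for i in range(8)]


def softmax(logits):
    """Convert raw classifier scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_prediction(logits, labels=CLASS_LABELS):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return labels[idx], probs[idx]


def build_reasoning_prompt(label, confidence):
    """Assemble a text prompt asking a language model for a structured explanation.

    The section headings below are illustrative assumptions, not taken from the paper.
    """
    return (
        "An endoscopic image classifier produced the following prediction.\n"
        f"Predicted class: {label}\n"
        f"Confidence: {confidence:.2f}\n"
        "Provide a structured explanation with these sections:\n"
        "1. Findings consistent with this class\n"
        "2. Differential considerations\n"
        "3. Suggested next steps"
    )


if __name__ == "__main__":
    # Stand-in logits for a single image; a real classifier would produce these.
    label, confidence = top_prediction([0.0, 0.0, 5.0, 0.0, 0.0, 0.0, 0.0, 0.0])
    print(build_reasoning_prompt(label, confidence))
```

Keeping the classifier and the prompt builder as separate steps means the language model only ever sees the label and confidence as text, which is one simple way to sidestep the visual-reasoning weakness the summary describes.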

Computer Science > Computer Vision and Pattern Recognition

arXiv:2512.13742 (cs)

This paper has been withdrawn by Md. Najib Hasan.

[Submitted on 14 Dec 2025 (v1), last revised 22 Feb 2026 (this version, v2)]

Title: DL$^3$M: A Vision-to-Language Framework for Expert-Level Medical Reasoning through Deep Learning and Large Language Models

Authors: Md. Najib Hasan (1), Imran Ahmad (1), Sourav Basak Shuvo (2), Md. Mahadi Hasan Ankon (2), Sunanda Das (3), Nazmul Siddique (4), Hui Wang (5)

Affiliations: (1) Wichita State University, USA; (2) Khulna University of Engineering and Technology, Bangladesh; (3) University of Arkansas, USA; (4) Ulster University, UK; (5) Queen's University Belfast, UK

Abstract: Medical image classifiers detect gastrointestinal diseases well, but they do not explain their decisions. Large language models can generate clinical text, yet they struggle with visual reasoning and often produce unstable or incorrect explanations. This leaves a gap between what a model sees and the type of reasoning a clinician expects. We introduce a framework that links image classification with structured clinical reasoning. A new hybrid model, MobileCoAtNet, is designed for endoscopic images and achieves high accuracy across eight stomach-related cl...
