[2602.14158] A Multi-Agent Framework for Medical AI: Leveraging Fine-Tuned GPT, LLaMA, and DeepSeek R1 for Evidence-Based and Bias-Aware Clinical Query Processing

[2602.14158] A Multi-Agent Framework for Medical AI: Leveraging Fine-Tuned GPT, LLaMA, and DeepSeek R1 for Evidence-Based and Bias-Aware Clinical Query Processing

arXiv - AI 4 min read Article

Summary

This article presents a multi-agent framework for medical AI that enhances clinical query processing by leveraging fine-tuned language models and evidence retrieval mechanisms.

Why It Matters

The integration of AI in healthcare is crucial for improving patient outcomes. This framework addresses significant limitations in current medical AI systems, such as verification and bias, making it a valuable contribution to the field. By enhancing the reliability of AI-generated answers, it can lead to better decision-making in clinical settings.

Key Takeaways

  • The framework combines multiple LLMs to improve clinical query processing.
  • Fine-tuning on MedQuAD data enhances the quality of medical QA.
  • Incorporates evidence retrieval and uncertainty estimation for reliable answers.
  • Achieves 87% accuracy and reduces uncertainty through evidence augmentation.
  • Includes safety mechanisms to detect bias and ensure factual consistency.

Computer Science > Computation and Language arXiv:2602.14158 (cs) [Submitted on 15 Feb 2026] Title:A Multi-Agent Framework for Medical AI: Leveraging Fine-Tuned GPT, LLaMA, and DeepSeek R1 for Evidence-Based and Bias-Aware Clinical Query Processing Authors:Naeimeh Nourmohammadi, Md Meem Hossain, The Anh Han, Safina Showkat Ara, Zia Ush Shamszaman View a PDF of the paper titled A Multi-Agent Framework for Medical AI: Leveraging Fine-Tuned GPT, LLaMA, and DeepSeek R1 for Evidence-Based and Bias-Aware Clinical Query Processing, by Naeimeh Nourmohammadi and 4 other authors View PDF HTML (experimental) Abstract:Large language models (LLMs) show promise for healthcare question answering, but clinical use is limited by weak verification, insufficient evidence grounding, and unreliable confidence signalling. We propose a multi-agent medical QA framework that combines complementary LLMs with evidence retrieval, uncertainty estimation, and bias checks to improve answer reliability. Our approach has two phases. First, we fine-tune three representative LLM families (GPT, LLaMA, and DeepSeek R1) on MedQuAD-derived medical QA data (20k+ question-answer pairs across multiple NIH domains) and benchmark generation quality. DeepSeek R1 achieves the strongest scores (ROUGE-1 0.536 +- 0.04; ROUGE-2 0.226 +-0.03; BLEU 0.098 -+ 0.018) and substantially outperforms the specialised biomedical baseline BioGPT in zero-shot evaluation. Second, we implement a modular multi-agent pipeline in which a C...

Related Articles

Llms

[R] GPT-5.4-mini regressed 22pp on vanilla prompting vs GPT-5-mini. Nobody noticed because benchmarks don't test this. Recursive Language Models solved it.

GPT-5.4-mini produces shorter, terser outputs by default. Vanilla accuracy dropped from 69.5% to 47.2% across 12 tasks (1,800 evals). The...

Reddit - Machine Learning · 1 min ·
Llms

built an open source CLI that auto generates AI setup files for your projects just hit 150 stars

hey everyone, been working on this side project called ai-setup and just hit a milestone i wanted to share 150 github stars, 90 PRs merge...

Reddit - Artificial Intelligence · 1 min ·
Llms

built an open source tool that auto generates AI context files for any codebase, 150 stars in

one of the most tedious parts of working with AI coding tools is having to manually write context files every single time. CLAUDE.md, .cu...

Reddit - Artificial Intelligence · 1 min ·
Find out what’s new in the Gemini app in March's Gemini Drop.
Llms

Find out what’s new in the Gemini app in March's Gemini Drop.

Gemini Drops is our regular monthly update on how to get the most out of the Gemini app.

AI Tools & Products · 1 min ·
More in Llms: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime