[2603.24481] Multi-Agent Reasoning with Consistency Verification Improves Uncertainty Calibration in Medical MCQA
Computer Science > Artificial Intelligence
arXiv:2603.24481 (cs)
[Submitted on 25 Mar 2026]
Title: Multi-Agent Reasoning with Consistency Verification Improves Uncertainty Calibration in Medical MCQA
Authors: John Ray B. Martinez
Abstract: Miscalibrated confidence scores are a practical obstacle to deploying AI in clinical settings. A model that is always overconfident offers no useful signal for deferral. We present a multi-agent framework that combines domain-specific specialist agents with Two-Phase Verification and S-Score Weighted Fusion to improve both calibration and discrimination in medical multiple-choice question answering. Four specialist agents (respiratory, cardiology, neurology, gastroenterology) generate independent diagnoses using Qwen2.5-7B-Instruct. Each diagnosis is then subjected to a two-phase self-verification process that measures internal consistency and produces a Specialist Confidence Score (S-score). The S-scores drive a weighted fusion strategy that selects the final answer and calibrates the reported confidence. We evaluate across four experimental settings, covering 100-question and 250-question high-disagreement subsets of both MedQA-USMLE and MedMCQA. Calibration improvement is the central finding, with ECE reduced by 49-74% across all four settings, including ...
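For readers unfamiliar with the metric, ECE (expected calibration error) is commonly computed by grouping predictions into confidence bins and averaging the gap between mean confidence and accuracy in each bin, weighted by bin size. The following is a minimal sketch of that standard definition, not the paper's evaluation code. The function name, the choice of 10 equal-width bins, and the sample inputs are assumptions made for illustration; the abstract does not state the binning scheme used.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard binned ECE: sum over bins of (bin weight) * |accuracy - mean confidence|.

    confidences: floats in [0, 1], one reported confidence per question.
    correct:     0/1 (or bool) flags, 1 if the chosen answer was right.
    n_bins:      number of equal-width bins (assumed value; not from the paper).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one prediction is required")

    counts = [0] * n_bins        # number of predictions per bin
    conf_sums = [0.0] * n_bins   # summed confidence per bin
    hit_sums = [0] * n_bins      # number of correct answers per bin

    for conf, hit in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        # Bins are [k/n_bins, (k+1)/n_bins); a confidence of exactly 1.0
        # is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += int(bool(hit))

    ece = 0.0
    for count, conf_sum, hits in zip(counts, conf_sums, hit_sums):
        if count == 0:
            continue  # empty bins contribute nothing
        accuracy = hits / count
        mean_conf = conf_sum / count
        ece += (count / n) * abs(accuracy - mean_conf)
    return ece


# Illustrative inputs only: an overconfident model (90% confidence, 75% accuracy).
overconfident = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

Under this definition a lower value means reported confidence tracks observed accuracy more closely, which is the sense in which the abstract reports a reduction.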