[2603.11413] Evaluation format, not model capability, drives triage failure in the assessment of consumer health AI