[2509.25095] Benchmarking ECG FMs: A Reality Check Across Clinical Tasks
Electrical Engineering and Systems Science > Signal Processing
arXiv:2509.25095 (eess)
[Submitted on 29 Sep 2025 (v1), last revised 4 Mar 2026 (this version, v2)]

Title: Benchmarking ECG FMs: A Reality Check Across Clinical Tasks
Authors: M A Al-Masud, Juan Miguel Lopez Alcaraz, Nils Strodthoff

Abstract: The 12-lead electrocardiogram (ECG) is a long-standing diagnostic tool. Yet machine learning for ECG interpretation remains fragmented, often limited to narrow tasks or datasets. Foundation models (FMs) promise broader adaptability, but fundamental questions remain: Which architectures generalize best? How do models scale with limited labels? What explains performance differences across model families? We benchmarked eight ECG FMs on 26 clinically relevant tasks using 12 public datasets comprising 1,650 regression and classification targets. Models were evaluated under fine-tuning and frozen settings, with scaling analyses across dataset sizes. Results show heterogeneous performance across domains: in adult ECG interpretation, three FMs consistently outperformed strong supervised baselines. In contrast, ECG-CPC, a compact structured state-space model, dominated 5 of 7 task categories, demonstrating that architecture matters more than scale. FMs improved label efficiency 3.3-9x over supervised baselines, though scaling behaviors varied across...