[2603.18123] Understanding Task Aggregation for Generalizable Ultrasound Foundation Models
Electrical Engineering and Systems Science > Image and Video Processing
arXiv:2603.18123 (eess)
[Submitted on 18 Mar 2026 (v1), last revised 20 Mar 2026 (this version, v2)]
Title: Understanding Task Aggregation for Generalizable Ultrasound Foundation Models
Authors: Fangyijie Wang, Tanya Akumu, Vien Ngoc Dang, Amelia Jiménez-Sánchez, Jieyun Bai, Guénolé Silvestre, Karim Lekadir, Kathleen M. Curran

Abstract: Foundation models promise to unify multiple clinical tasks within a single framework, but recent ultrasound studies report that unified models can underperform task-specific baselines. We hypothesize that this degradation arises not from model capacity limitations, but from task aggregation strategies that ignore interactions between task heterogeneity and available training data scale. In this work, we systematically analyze when heterogeneous ultrasound tasks can be jointly learned without performance loss, establishing practical criteria for task aggregation in unified clinical imaging models. We introduce M2DINO, a multi-organ, multi-task framework built on DINOv3 with task-conditioned Mixture-of-Experts blocks for adaptive capacity allocation. We systematically evaluate 27 ultrasound tasks spanning segmentation, classification, detection, and regression under three paradigms: task-specific, cli...