[2511.03255] Generative deep learning for foundational video translation in ultrasound
Computer Science > Computer Vision and Pattern Recognition
arXiv:2511.03255 (cs)
[Submitted on 5 Nov 2025 (v1), last revised 25 Mar 2026 (this version, v2)]

Title: Generative deep learning for foundational video translation in ultrasound
Authors: Nikolina Tomic, Roshni Bhatnagar, Sarthak Jain, Connor Lau, Tien-Yu Liu, Laura Gambini, Rima Arnaout

Abstract: Deep learning (DL) has the potential to revolutionize image acquisition and interpretation across medicine; however, attention to data imbalance and missingness is required. Ultrasound data presents a particular challenge because, in addition to different views and structures, it includes several sub-modalities, such as greyscale and color flow Doppler (CFD), that are often imbalanced in clinical studies. Image translation can help balance datasets but has to date been challenging for ultrasound sub-modalities. Here, we present a generative method for ultrasound CFD-to-greyscale video translation, trained on 54,975 videos and tested on 8,368. The method leveraged pixel-wise, adversarial, and perceptual losses and utilized two networks: one for reconstructing anatomic structures and one for denoising, to achieve realistic ultrasound imaging. Average pairwise SSIM between synthetic videos and ground truth was 0.91+/-0.04. Synthetic videos performed indistinguishably from real ones...
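The abstract reports an average pairwise SSIM of 0.91+/-0.04 between synthetic and ground-truth videos. As a minimal sketch of what that metric measures, the snippet below computes a single-window (global) SSIM per frame pair and averages over a video; the paper's exact implementation (window size, data range, library) is not specified here, and the example frames are hypothetical.

```python
# Hedged sketch: global (single-window) SSIM between paired frames,
# averaged over a video. Real evaluations typically use a sliding
# Gaussian window (e.g. skimage.metrics.structural_similarity); this
# global variant only illustrates the formula.
from statistics import mean

def ssim_global(x, y, data_range=1.0):
    """SSIM over two equal-length flattened pixel sequences in [0, data_range]."""
    c1 = (0.01 * data_range) ** 2  # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mx, my = mean(x), mean(y)
    vx = mean((p - mx) ** 2 for p in x)          # variance of x
    vy = mean((q - my) ** 2 for q in y)          # variance of y
    cov = mean((p - mx) * (q - my) for p, q in zip(x, y))  # covariance
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def avg_pairwise_ssim(video_a, video_b, data_range=1.0):
    """Average SSIM over corresponding frames of two videos."""
    return mean(ssim_global(fa, fb, data_range)
                for fa, fb in zip(video_a, video_b))

# Hypothetical 2-frame "videos" of 4 pixels each; identical frames give SSIM = 1.
synthetic = [[0.1, 0.5, 0.9, 0.3], [0.2, 0.4, 0.8, 0.6]]
ground_truth = [[0.1, 0.5, 0.9, 0.3], [0.2, 0.4, 0.8, 0.6]]
score = avg_pairwise_ssim(synthetic, ground_truth)
```

Identical inputs yield an SSIM of exactly 1.0, which makes the function easy to sanity-check before pointing it at real frame data.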