[2602.19322] US-JEPA: A Joint Embedding Predictive Architecture for Medical Ultrasound

arXiv - Machine Learning 4 min read Article

Summary

The paper presents US-JEPA, a self-supervised framework for medical ultrasound that learns representations by predicting masked latent representations rather than raw pixels. This sidesteps the speckle noise and low signal-to-noise ratio that hinder pixel-level reconstruction objectives.

Why It Matters

US-JEPA targets two practical obstacles in ultrasound representation learning: pixel-level reconstruction objectives struggle with noisy acquisitions, and teachers updated via exponential moving average are computationally expensive and sensitive to hyperparameters. If the approach holds up, it could make learned ultrasound representations more reliable for diagnostic use, with potential benefits for patient care and outcomes.

Key Takeaways

  • US-JEPA adopts the Static-teacher Asymmetric Latent Training (SALT) objective, in which a frozen, domain-specific teacher provides stable latent targets.
  • The framework shows competitive performance against existing ultrasound models on the UltraBench benchmark.
  • Masked latent prediction is proposed as a route to more efficient and robust ultrasound representation learning.
  • The paper provides a rigorous comparison of publicly available state-of-the-art ultrasound models.
  • This research could lead to improved diagnostic capabilities in medical imaging.

Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.19322 (cs)
[Submitted on 22 Feb 2026]

Title: US-JEPA: A Joint Embedding Predictive Architecture for Medical Ultrasound

Authors: Ashwath Radhachandran, Vedrana Ivezić, Shreeram Athreya, Ronit Anilkumar, Corey W. Arnold, William Speier

Abstract: Ultrasound (US) imaging poses unique challenges for representation learning due to its inherently noisy acquisition process. The low signal-to-noise ratio and stochastic speckle patterns hinder standard self-supervised learning methods relying on a pixel-level reconstruction objective. Joint-Embedding Predictive Architectures (JEPAs) address this drawback by predicting masked latent representations rather than raw pixels. However, standard approaches depend on hyperparameter-brittle and computationally expensive online teachers updated via exponential moving average. We propose US-JEPA, a self-supervised framework that adopts the Static-teacher Asymmetric Latent Training (SALT) objective. By using a frozen, domain-specific teacher to provide stable latent targets, US-JEPA decouples student-teacher optimization and pushes the student to expand upon the semantic priors of the teacher. In addition, we provide the first rigorous comparison of all publicly available state-of-the-art ultrasound fo...
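To make the idea of predicting masked latents against a frozen teacher more concrete, here is a minimal toy sketch in plain Python. It is not the authors' implementation: the linear `encode` function, the mean-pooled context, the scalar position signal, and every name below are hypothetical choices made only for illustration. It shows a loss computed on masked patches only, with targets coming from a teacher whose weights are never updated, and it includes the usual exponential moving average update purely for contrast.

```python
def encode(weights, vector):
    """Toy linear encoder: multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def ema_update(teacher, student, momentum):
    """Conventional EMA teacher update, shown only for contrast.

    A frozen teacher simply skips this step (equivalent to momentum = 1.0).
    """
    return [
        [momentum * t + (1.0 - momentum) * s for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher, student)
    ]


def masked_latent_loss(patches, masked, teacher_w, student_w, predictor_w):
    """Mean squared error between predicted and teacher latents on masked patches."""
    # The student encodes only the visible (unmasked) patches.
    visible = [p for i, p in enumerate(patches) if i not in masked]
    latents = [encode(student_w, p) for p in visible]
    # Assumption: the context is the mean of the visible latents.
    context = [sum(col) / len(latents) for col in zip(*latents)]

    total, count = 0.0, 0
    for i in sorted(masked):
        # The frozen teacher sees the masked patch and supplies the target latent.
        target = encode(teacher_w, patches[i])
        # Assumption: the predictor gets the context plus the patch index.
        prediction = encode(predictor_w, context + [float(i)])
        for p, t in zip(prediction, target):
            total += (p - t) ** 2
            count += 1
    return total / count


# Two 2-dimensional "patches"; patch 1 is masked.
patches = [[1.0, 0.0], [0.0, 1.0]]
masked = {1}
teacher_w = [[1.0, 0.0], [0.0, 1.0]]            # frozen, never updated
student_w = [[1.0, 0.0], [0.0, 1.0]]            # would be trained by gradient descent
predictor_w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # ignores the position input here

loss = masked_latent_loss(patches, masked, teacher_w, student_w, predictor_w)
print(loss)
```

In a real training loop only the student and predictor weights would receive gradients from this loss. Because the teacher is static, no momentum schedule has to be tuned, and its targets could in principle be computed once and reused.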
