[2602.15159] Learning Representations from Incomplete EHR Data with Dual-Masked Autoencoding

arXiv - Machine Learning 3 min read Article

Summary

The paper presents the Augmented-Intrinsic Dual-Masked Autoencoder (AID-MAE), a novel method for learning from incomplete electronic health records (EHRs) by directly addressing missing data challenges in clinical time series.

Why It Matters

This research tackles a common obstacle to machine learning in healthcare: EHR data is routinely incomplete. By learning useful representations directly from sparse records, AID-MAE can support clinical decision-making and patient stratification.

Key Takeaways

  • AID-MAE effectively learns from incomplete EHR time series without prior imputation.
  • The model uses dual masking to represent missing values and enhance reconstruction.
  • It outperforms existing methods like XGBoost and DuETT across various clinical tasks.
  • The learned embeddings can stratify patient cohorts, aiding in personalized medicine.
  • This approach addresses the challenges of irregular sampling and heterogeneous missingness in EHR data.

Computer Science > Machine Learning
arXiv:2602.15159 (cs) [Submitted on 16 Feb 2026]

Title: Learning Representations from Incomplete EHR Data with Dual-Masked Autoencoding
Authors: Xiao Xiang, David Restrepo, Hyewon Jeong, Yugang Jia, Leo Anthony Celi

Abstract: Learning from electronic health record (EHR) time series is challenging due to irregular sampling, heterogeneous missingness, and the resulting sparsity of observations. Prior self-supervised methods either impute before learning, represent missingness through a dedicated input signal, or optimize solely for imputation, reducing their capacity to efficiently learn representations that support clinical downstream tasks. We propose the Augmented-Intrinsic Dual-Masked Autoencoder (AID-MAE), which learns directly from incomplete time series by applying an intrinsic missing mask to represent naturally missing values and an augmented mask that hides a subset of observed values for reconstruction during training. AID-MAE processes only the unmasked subset of tokens and consistently outperforms strong baselines, including XGBoost and DuETT, across multiple clinical tasks on two datasets. In addition, the learned embeddings naturally stratify patient cohorts in the representation space.

Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2602.15159
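The dual-masking idea from the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation; it is a minimal illustration, assuming a dense time-step-by-feature matrix with NaN for naturally missing values, a hypothetical `aug_ratio` hyperparameter for the fraction of observed values hidden during training, and zero-filling for hidden tokens.

```python
import numpy as np

rng = np.random.default_rng(0)

def dual_masks(x, aug_ratio=0.3, rng=rng):
    """Build the two masks used in dual-masked autoencoding.

    intrinsic: True where a value is naturally missing (NaN in the record).
    augmented: True where an observed value is hidden for reconstruction.
    """
    intrinsic = np.isnan(x)                 # naturally missing entries
    observed = ~intrinsic
    # Hide a random fraction of the observed entries for the training objective.
    augmented = observed & (rng.random(x.shape) < aug_ratio)
    return intrinsic, augmented

def training_views(x, intrinsic, augmented):
    """Return the encoder input and the reconstruction targets."""
    visible = ~(intrinsic | augmented)      # only these tokens are processed
    enc_input = np.where(visible, x, 0.0)   # hidden and missing entries zeroed
    targets = np.where(augmented, x, np.nan)  # loss only on augmented positions
    return enc_input, targets

# Toy record: 4 time steps x 3 features, with two naturally missing values.
x = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.nan],
              [7.0, 8.0, 9.0],
              [0.5, 1.5, 2.5]])
intrinsic, augmented = dual_masks(x)
enc_input, targets = training_views(x, intrinsic, augmented)
```

The key property is that the reconstruction loss is computed only at augmented-mask positions, where ground truth exists, so the model never trains against imputed values at intrinsically missing positions.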

Related Articles

AI Startups

This AI startup envisions 100 Million New People Making Videogames


Reddit - Artificial Intelligence · 1 min ·
LLMs

A robot car with a Claude AI brain started a YouTube vlog about its own existence

Not a demo reel. Not a tutorial. A robot narrating its own experience — debugging, falling off shelves, questioning its identity. First-p...

Reddit - Artificial Intelligence · 1 min ·
AI Startups

Anthropic ramps up its political activities with a new PAC | TechCrunch

With the midterms right around the corner, the new group is positioned to back candidates who support the AI company's policy agenda.

TechCrunch - AI · 3 min ·
AI Startups

Anthropic buys biotech startup Coefficient Bio in $400M deal: Reports | TechCrunch

Anthropic has purchased the stealth biotech AI startup Coefficient Bio in a $400 million stock deal, according to The Information and Eri...

TechCrunch - AI · 3 min ·