[2602.13857] sleep2vec: Unified Cross-Modal Alignment for Heterogeneous Nocturnal Biosignals
Summary
The paper presents sleep2vec, a foundation model that aligns diverse and incomplete nocturnal biosignals in a shared representation. It targets sleep staging and clinical outcome assessment, and is designed to cope with sensor dropout and device heterogeneity.
Why It Matters
Sleep analysis and clinical diagnostics draw on signals from polysomnography devices, bedside monitors, and wearables, but device differences and missing sensors make these signals hard to model together. A single model that accepts heterogeneous and incomplete inputs could support more effective monitoring and treatment of sleep disorders.
Key Takeaways
- sleep2vec offers a unified model for diverse nocturnal biosignals.
- It addresses challenges of sensor dropout and device heterogeneity.
- On downstream sleep staging and clinical outcome assessment, it consistently outperforms strong baselines.
- It incorporates physiological and acquisition metadata (e.g., age, gender, recording site) to weight negatives during contrastive pre-training.
- It establishes scaling laws for nocturnal biosignals with respect to modality diversity.
Computer Science > Machine Learning
arXiv:2602.13857 (cs)
[Submitted on 14 Feb 2026]
Title: sleep2vec: Unified Cross-Modal Alignment for Heterogeneous Nocturnal Biosignals
Authors: Weixuan Yuan, Zengrui Jin, Yichen Wang, Donglin Xie, Ziyi Ye, Chao Zhang, Xuesong Chen
Abstract: Tasks ranging from sleep staging to clinical diagnosis traditionally rely on standard polysomnography (PSG) devices, bedside monitors and wearable devices, which capture diverse nocturnal biosignals (e.g., EEG, EOG, ECG, SpO2). However, heterogeneity across devices and frequent sensor dropout pose significant challenges for unified modelling of these multimodal signals. We present sleep2vec, a foundation model for diverse and incomplete nocturnal biosignals that learns a shared representation via cross-modal alignment. sleep2vec is contrastively pre-trained on 42,249 overnight recordings spanning nine modalities using a Demography, Age, Site & History-aware InfoNCE objective that incorporates physiological and acquisition metadata (e.g., age, gender, recording site) to dynamically weight negatives and mitigate cohort-specific shortcuts. On downstream sleep staging and clinical outcome assessment, sleep2vec consistently outperforms strong baselines and remains robust to any subset o...
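To make the idea of a metadata-weighted contrastive objective more concrete, here is a minimal NumPy sketch of an InfoNCE loss whose negatives are reweighted by metadata overlap. It is not the paper's formulation: the abstract does not give the weighting rule, so the function name `metadata_weighted_info_nce`, the `boost` and `temperature` parameters, and the rule "negatives that share more metadata fields with the anchor get a larger weight" are all assumptions made for illustration. Under these assumptions, the per-anchor loss is the negative log of exp(s_ii/τ) divided by exp(s_ii/τ) + Σ_{j≠i} w_ij exp(s_ij/τ), where s_ij is the cosine similarity between modality A of recording i and modality B of recording j.

```python
import numpy as np

def metadata_weighted_info_nce(z_a, z_b, meta, temperature=0.1, boost=1.0):
    """Illustrative InfoNCE with metadata-dependent negative weights.

    z_a, z_b : (N, D) embeddings of two modalities from the same N recordings;
               row i of z_a and row i of z_b form the positive pair.
    meta     : (N, K) integer-coded metadata (e.g. age bin, gender, site).
    boost    : hypothetical knob; 0 recovers the standard InfoNCE loss.
    """
    # Cosine similarity via L2-normalised embeddings.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                      # (N, N)

    # Fraction of metadata fields shared by recordings i and j.
    match = (meta[:, None, :] == meta[None, :, :]).mean(axis=2)

    # Assumed rule: negatives from a similar cohort count more, so cohort
    # cues alone (e.g. recording site) cannot separate them from the positive.
    weights = 1.0 + boost * match
    np.fill_diagonal(weights, 1.0)                          # positives keep unit weight

    # Numerically stable log of sum_j w_ij * exp(logits_ij).
    shifted = logits + np.log(weights)
    row_max = shifted.max(axis=1, keepdims=True)
    log_denom = row_max[:, 0] + np.log(np.exp(shifted - row_max).sum(axis=1))

    return float((log_denom - np.diag(logits)).mean())


# Usage with random stand-in data (no real recordings involved).
rng = np.random.default_rng(0)
eeg_emb = rng.normal(size=(8, 16))          # hypothetical EEG embeddings
ecg_emb = rng.normal(size=(8, 16))          # hypothetical ECG embeddings
metadata = rng.integers(0, 3, size=(8, 3))  # hypothetical coded age/gender/site
loss = metadata_weighted_info_nce(eeg_emb, ecg_emb, metadata)
```

Because every weight is at least 1, this weighted loss is never smaller than the unweighted one for the same embeddings. A different design would instead down-weight negatives that are likely false negatives. Which direction the paper takes is not stated in the text above.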