[2409.12709] SeqRisk: Transformer-augmented latent variable model for robust survival prediction with longitudinal data
Summary
SeqRisk is a transformer-augmented latent variable model for survival prediction from longitudinal healthcare data. It addresses a key limitation of conventional survival analysis, which relies on data from a single time point.
Why It Matters
Conventional survival analysis typically relies on data from a single time point, so it cannot fully use a patient's longitudinal history or capture temporal regularities. SeqRisk models the full history, including irregular, noisy, and sparsely observed measurements, with the aim of predicting risk more accurately and identifying high-risk patients. More accurate risk estimates can in turn support better-targeted healthcare interventions.
Key Takeaways
- SeqRisk combines a variational autoencoder (VAE) or longitudinal VAE (LVAE) with transformer-based sequence aggregation and a Cox proportional hazards module for risk prediction.
- The model effectively handles irregular and sparse longitudinal data, enhancing predictive accuracy.
- Compared with traditional methods, SeqRisk remains robust as data sparsity increases.
- Partial explainability is provided, aiding in understanding patient risk profiles.
- The approach is particularly relevant for clinical applications in healthcare risk assessment.
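SeqRisk's risk prediction step is described as a Cox proportional hazards module. As a rough illustration of what such a module optimizes, the sketch below computes the negative Cox partial log-likelihood in plain Python. It is not the authors' implementation: the function name `cox_neg_log_partial_likelihood`, the argument layout, and the averaging over observed events are assumptions made for this example, and the scores are stand-ins for whatever the transformer aggregation would output.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk_scores: predicted log-risk per patient (hypothetical model output)
    times:       observed follow-up time per patient
    events:      1 if the event was observed, 0 if the patient was censored
    """
    n = len(risk_scores)
    total = 0.0
    n_events = 0
    for i in range(n):
        if not events[i]:
            continue  # censored patients only contribute through risk sets
        # Risk set: everyone still under observation at patient i's event time.
        risk_set = [risk_scores[j] for j in range(n) if times[j] >= times[i]]
        # Log-sum-exp keeps the denominator numerically stable.
        m = max(risk_set)
        log_denom = m + math.log(sum(math.exp(r - m) for r in risk_set))
        total += risk_scores[i] - log_denom
        n_events += 1
    return -total / max(n_events, 1)


# Toy usage: three patients, all with observed events.
well_ordered = cox_neg_log_partial_likelihood([2.0, 1.0, 0.0], [1, 2, 3], [1, 1, 1])
mis_ordered = cox_neg_log_partial_likelihood([0.0, 1.0, 2.0], [1, 2, 3], [1, 1, 1])
```

In the toy usage, the scores that rank the earliest event as highest risk give the lower loss, which is the behavior a Cox-based risk head is trained toward. Tied event times are handled here in the simple Breslow style, which is also an assumption of this sketch.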
Computer Science > Machine Learning
arXiv:2409.12709 (cs)
[Submitted on 19 Sep 2024 (v1), last revised 19 Feb 2026 (this version, v3)]
Title: SeqRisk: Transformer-augmented latent variable model for robust survival prediction with longitudinal data
Authors: Mine Öğretir, Miika Koskinen, Juha Sinisalo, Risto Renkonen, Harri Lähdesmäki
Abstract: In healthcare, risk assessment of patient outcomes has been based on survival analysis for a long time, i.e. modeling time-to-event associations. However, conventional approaches rely on data from a single time-point, making them suboptimal for fully leveraging longitudinal patient history and capturing temporal regularities. Focusing on clinical real-world data and acknowledging its challenges, we utilize latent variable models to effectively handle irregular, noisy, and sparsely observed longitudinal data. We propose SeqRisk, a method that combines variational autoencoder (VAE) or longitudinal VAE (LVAE) with a transformer-based sequence aggregation and Cox proportional hazards module for risk prediction. SeqRisk captures long-range interactions, enhances predictive accuracy and generalizability, as well as provides partial explainability for sample population characteristics in attempts to identify high-risk patients. SeqRisk d...