[2603.23987] Can we generate portable representations for clinical time series data using LLMs?
Computer Science > Machine Learning
arXiv:2603.23987 (cs) [Submitted on 25 Mar 2026]

Title: Can we generate portable representations for clinical time series data using LLMs?
Authors: Zongliang Ji, Yifei Sun, Andre Amaral, Anna Goldenberg, Rahul G. Krishnan

Abstract: Deploying clinical ML is slow and brittle: models that work at one hospital often degrade under distribution shift at the next. In this work, we study a simple question: can large language models (LLMs) create portable patient embeddings, i.e., representations of patients that enable a downstream predictor built at one hospital to be used elsewhere with minimal-to-no retraining or fine-tuning? To do so, we map irregular ICU time series to concise natural-language summaries using a frozen LLM, then embed each summary with a frozen text-embedding model to obtain a fixed-length vector that can serve as input to a variety of downstream predictors. Across three cohorts (MIMIC-IV, HIRID, PPICU) and multiple clinically grounded forecasting and classification tasks, we find that our approach is simple, easy to use, and competitive in-distribution with grid imputation, self-supervised representation learning, and time series foundation models, while exhibiting smaller relative performance drops when t...
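The two-stage pipeline the abstract describes (irregular ICU events → natural-language summary via a frozen LLM → fixed-length embedding via a frozen text encoder) can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: `summarize_events` is a deterministic template standing in for the frozen LLM, and `embed_summary` is a hashed bag-of-words standing in for the frozen text-embedding model; only the data flow and the fixed output dimensionality mirror the described approach.

```python
import hashlib
import math

def summarize_events(events):
    """Stand-in for the frozen LLM: render irregular ICU events
    (time_hours, variable, value) as a concise natural-language summary.
    The paper uses an actual LLM; this template is a hypothetical stub."""
    by_var = {}
    for t, var, val in sorted(events):
        by_var.setdefault(var, []).append((t, val))
    parts = []
    for var, obs in by_var.items():
        first, last = obs[0][1], obs[-1][1]
        parts.append(f"{var} moved from {first} to {last} over {len(obs)} readings")
    return "Patient summary: " + "; ".join(parts) + "."

def embed_summary(text, dim=64):
    """Stand-in for the frozen text-embedding model: hash tokens into a
    fixed-length, L2-normalized vector, so every hospital's summaries map
    to the same input shape for a downstream predictor."""
    vec = [0.0] * dim
    for tok in text.lower().split():
        h = int(hashlib.md5(tok.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

# Toy irregular time series: (time in hours, variable, value) tuples.
events = [(0.0, "heart_rate", 88), (4.5, "heart_rate", 112),
          (1.0, "lactate", 1.9), (6.0, "lactate", 3.4)]
summary = summarize_events(events)
embedding = embed_summary(summary)
```

Because the summary and embedding models are frozen, a predictor trained on these fixed-length vectors at one hospital can, in principle, consume vectors produced the same way elsewhere, which is the portability claim the paper evaluates.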