[2603.22900] Off-Policy Evaluation and Learning for Survival Outcomes under Censoring
Statistics > Methodology, arXiv:2603.22900 (stat)
[Submitted on 24 Mar 2026]

Title: Off-Policy Evaluation and Learning for Survival Outcomes under Censoring
Authors: Kohsuke Kubota, Mitsuhiro Takahashi, Yuta Saito

Abstract: Optimizing survival outcomes, such as patient survival or customer retention, is a critical objective in data-driven decision-making. Off-Policy Evaluation (OPE) provides a powerful framework for assessing such decision-making policies using logged data alone, without the need for costly or risky online experiments in high-stakes applications. However, typical estimators are not designed to handle right-censored survival outcomes: they ignore unobserved survival times beyond the censoring time, leading to systematic underestimation of the true policy performance. To address this issue, we propose a novel framework for OPE and Off-Policy Learning (OPL) tailored to survival outcomes under censoring. Specifically, we introduce IPCW-IPS and IPCW-DR, which employ the Inverse Probability of Censoring Weighting technique to explicitly correct for censoring bias. We theoretically establish that our estimators are unbiased and that IPCW-DR achieves double robustness, ensuring consistency if either the propensity score or the outcome model is correct. Furthermore, we extend this framework to cons...
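The abstract does not spell out the estimator, but the IPCW-IPS idea it describes can be sketched as follows: combine the usual importance weight pi_e/pi_0 with an inverse-probability-of-censoring weight delta/S_C, so that uncensored observations are up-weighted to compensate for those lost to censoring. This is a minimal illustrative sketch, not the paper's implementation; the function name `ipcw_ips_estimate`, the argument names, and the clipping constant are my own assumptions, and `s_c` is assumed to come from some external censoring model (e.g. a Kaplan–Meier fit).

```python
import numpy as np

def ipcw_ips_estimate(pi_e, pi_0, delta, s_c, outcome):
    """Sketch of an IPCW-weighted IPS estimate of policy value.

    pi_e, pi_0 : evaluation- and logging-policy probabilities of the
                 logged action (arrays of shape (n,))
    delta      : 1 if the survival time was observed (uncensored), else 0
    s_c        : estimated censoring survival probability at the
                 observed time (hypothetical external censoring model)
    outcome    : observed survival outcome (e.g. survival time)
    """
    iw = pi_e / pi_0                        # standard importance weight
    cw = delta / np.clip(s_c, 1e-6, None)   # IPCW term: uncensored units only
    return float(np.mean(iw * cw * outcome))
```

With no policy shift (pi_e = pi_0) and no censoring (delta = 1, s_c = 1) the estimate reduces to the sample mean of the outcomes, which is a quick sanity check on the weighting.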