Mini-batch Estimation for Deep Cox Models: Statistical Foundations and Practical Guidance
Statistics > Machine Learning
arXiv:2408.02839 (stat)
[Submitted on 5 Aug 2024 (v1), last revised 30 Mar 2026 (this version, v5)]

Authors: Lang Zeng, Weijing Tang, Zhao Ren, Ying Ding

Abstract: The stochastic gradient descent (SGD) algorithm has been widely used to optimize deep Cox neural networks (Cox-NNs) by updating model parameters with mini-batches of data. We show that SGD aims to optimize the average of the mini-batch partial likelihoods, which differs from the standard partial likelihood. This distinction requires developing new statistical properties for the global optimizer, namely the mini-batch maximum partial-likelihood estimator (mb-MPLE). We establish that the mb-MPLE for Cox-NN is consistent and achieves the optimal minimax convergence rate up to a polylogarithmic factor. For Cox regression with linear covariate effects, we further show that the mb-MPLE is $\sqrt{n}$-consistent and asymptotically normal, with asymptotic variance approaching the information lower bound as the batch size increases, which is confirmed by simulation studies. Additionally, we offer practical guidance on using SGD, supported by theoretical analysis and numerical evidence. For Cox-NN, we demonstrate that the ratio of th...
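As a concrete illustration of the objective the abstract describes, here is a minimal PyTorch sketch of the mini-batch negative log partial likelihood. This is not code from the paper: the function name, the per-event normalization, and the rough tie handling are illustrative assumptions; only the restriction of each risk set to the current mini-batch reflects the mb-MPLE setup analyzed in the paper.

```python
import torch

def neg_mb_log_partial_likelihood(risk, time, event):
    """Negative log partial likelihood on a single mini-batch.

    risk  : (B,) predicted log-risk scores f(x) from the network
    time  : (B,) observed event/censoring times
    event : (B,) event indicators (1 = event observed, 0 = censored)

    Each event's risk set is restricted to the mini-batch, so averaging
    this loss over random batches targets the average mini-batch partial
    likelihood rather than the full-sample partial likelihood.
    Ties in `time` are handled only approximately (no Breslow/Efron
    correction), as is common in deep Cox implementations.
    """
    order = torch.argsort(time, descending=True)   # latest time first
    risk, event = risk[order], event[order].float()
    # After the descending sort, the risk set {j : t_j >= t_i} of subject i
    # is the prefix 0..i, so its log-sum-exp is a running logcumsumexp.
    log_risk_set = torch.logcumsumexp(risk, dim=0)
    n_events = event.sum().clamp(min=1.0)          # guard all-censored batches
    return -((risk - log_risk_set) * event).sum() / n_events
```

In a hypothetical training loop, one would evaluate this loss on each random mini-batch and step an optimizer such as torch.optim.SGD; the batch size then plays the role the abstract highlights, with the implicit objective approaching the full-sample partial likelihood as the batch size grows.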