Mini-batch Estimation for Deep Cox Models: Statistical Foundations and Practical Guidance
Statistics > Machine Learning
arXiv:2408.02839 (stat)
[Submitted on 5 Aug 2024 (v1), last revised 30 Mar 2026 (this version, v5)]

Authors: Lang Zeng, Weijing Tang, Zhao Ren, Ying Ding

Abstract: The stochastic gradient descent (SGD) algorithm has been widely used to optimize deep Cox neural networks (Cox-NNs) by updating model parameters with mini-batches of data. We show that SGD aims to optimize the average of the mini-batch partial likelihoods, which differs from the standard partial likelihood. This distinction requires developing new statistical properties for the global optimizer, namely the mini-batch maximum partial-likelihood estimator (mb-MPLE). We establish that the mb-MPLE for Cox-NN is consistent and achieves the optimal minimax convergence rate up to a polylogarithmic factor. For Cox regression with linear covariate effects, we further show that the mb-MPLE is $\sqrt{n}$-consistent and asymptotically normal, with asymptotic variance approaching the information lower bound as the batch size increases, which is confirmed by simulation studies. Additionally, we offer practical guidance on using SGD, supported by theoretical analysis and numerical evidence. For Cox-NN, we demonstrate that the ratio of th...
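As a concrete illustration of the objective the abstract describes, here is a minimal PyTorch sketch of the mini-batch negative log partial likelihood. This is not code from the paper: the function name, the per-event normalization, and the rough tie handling are illustrative assumptions; only the restriction of each risk set to the current mini-batch reflects the mb-MPLE setup analyzed in the paper.

```python
import torch

def neg_mb_log_partial_likelihood(risk, time, event):
    """Negative log partial likelihood on a single mini-batch.

    risk  : (B,) predicted log-risk scores f(x) from the network
    time  : (B,) observed event/censoring times
    event : (B,) event indicators (1 = event observed, 0 = censored)

    Each event's risk set is restricted to the mini-batch, so averaging
    this loss over random batches targets the average mini-batch partial
    likelihood rather than the full-sample partial likelihood.
    Ties in `time` are handled only approximately (no Breslow/Efron
    correction), as is common in deep Cox implementations.
    """
    order = torch.argsort(time, descending=True)   # latest time first
    risk, event = risk[order], event[order].float()
    # After the descending sort, the risk set {j : t_j >= t_i} of subject i
    # is the prefix 0..i, so its log-sum-exp is a running logcumsumexp.
    log_risk_set = torch.logcumsumexp(risk, dim=0)
    n_events = event.sum().clamp(min=1.0)          # guard all-censored batches
    return -((risk - log_risk_set) * event).sum() / n_events
```

In a hypothetical training loop, one would evaluate this loss on each random mini-batch and step an optimizer such as torch.optim.SGD; the batch size then plays the role the abstract highlights, with the implicit objective approaching the full-sample partial likelihood as the batch size grows.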