[2604.03939] Fused Multinomial Logistic Regression Utilizing Summary-Level External Machine-learning Information
arXiv:2604.03939 (stat), Statistics > Methodology
Submitted on 5 Apr 2026

Title: Fused Multinomial Logistic Regression Utilizing Summary-Level External Machine-learning Information
Authors: Chi-Shian Dai, Jun Shao

Abstract: In many modern applications, a carefully designed primary study provides individual-level data for interpretable modeling, while summary-level external information is available through black-box, efficient, and nonparametric machine-learning predictions. Although summary-level external information has been studied in the data integration literature, there is limited methodology for leveraging external nonparametric machine-learning predictions to improve statistical inference in the primary study. We propose a general empirical-likelihood framework that incorporates external predictions through moment constraints. An advantage of nonparametric machine-learning prediction is that it induces a rich class of valid moment restrictions that remain robust to covariate shift under a mild overlap condition without requiring explicit density-ratio modeling. We focus on multinomial logistic regression as the primary model and address common data-quality issues in external sources, including coarsened outcomes, partially observed covariates, covariate shift, and heterogeneity in...
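The abstract does not spell out the moment functions, so the following is only an illustrative sketch of what "incorporating external predictions through moment constraints" could look like for a multinomial logistic model. It assumes the external machine-learning prediction m(x) approximates the class probabilities given covariates x, and it uses constraints of the form E[h(X){p(X; beta) - m(X)}] = 0 for user-chosen functions h. The names `softmax_probs`, `moment_values`, `average_moments`, `external_pred`, and `h` are hypothetical and are not taken from the paper.

```python
import math

def softmax_probs(x, betas):
    """Multinomial-logit class probabilities, with class 0 as the reference class.

    x     : list of covariate values for one individual
    betas : one coefficient list per non-reference class
    """
    scores = [0.0] + [sum(b_j * x_j for b_j, x_j in zip(b, x)) for b in betas]
    top = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moment_values(x, betas, external_pred, h):
    """Assumed moment function: h(x) * (p_k(x; beta) - m_k(x)) for each
    non-reference class k, where m = external_pred(x) is the external prediction."""
    p = softmax_probs(x, betas)
    m = external_pred(x)
    return [h_val * (p[k] - m[k]) for k in range(1, len(p)) for h_val in h(x)]

def average_moments(xs, betas, external_pred, h):
    """Sample average of the moment function over the primary-study covariates.
    An empirical-likelihood fit would constrain a weighted version of this to zero."""
    rows = [moment_values(x, betas, external_pred, h) for x in xs]
    return [sum(col) / len(rows) for col in zip(*rows)]

# Toy usage with made-up numbers (not data from the paper).
xs = [[1.0, 0.5], [1.0, -1.0], [1.0, 2.0]]
betas = [[0.2, -0.1], [0.0, 0.3]]
external_pred = lambda x: [0.4, 0.3, 0.3]   # hypothetical black-box prediction
h = lambda x: [1.0] + list(x)               # hypothetical choice of h
print(average_moments(xs, betas, external_pred, h))
```

Because each constraint compares conditional probabilities at the same x, the sketch never needs a density-ratio model for the covariates. The estimator, the choice of h, and the handling of coarsened outcomes in the paper itself may differ from this simplified form.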