[2601.18857] Statistical Inference for Explainable Boosting Machines
Statistics > Machine Learning
arXiv:2601.18857 (stat)
[Submitted on 26 Jan 2026 (v1), last revised 30 Mar 2026 (this version, v2)]

Title: Statistical Inference for Explainable Boosting Machines
Authors: Haimo Fang, Kevin Tan, Jonathan Pipping-Gamon, Giles Hooker

Abstract: Explainable boosting machines (EBMs) are popular "glass-box" models that learn a set of univariate functions using boosted trees. They achieve explainability through visualizations of each feature's effect. However, unlike the coefficients of a linear model, the learned univariate functions admit no closed-form uncertainty quantification; assessing them requires computationally intensive bootstrapping, making it hard to know which features truly matter. We provide an alternative based on recent advances in statistical inference for gradient boosting, deriving inferential procedures along with end-to-end theoretical guarantees. Using a moving average instead of a sum of trees (Boulevard regularization) allows the boosting process to converge to a feature-wise kernel ridge regression. This produces asymptotically normal predictions that achieve the minimax-optimal MSE of $O(p n^{-2/3})$ for fitting Lipschitz GAMs with $p$ features, avoiding the curse of dimensionality. We then construct prediction intervals for the response and confidence intervals for each learned univariate function…