[2206.02088] LOCO Feature Importance Inference without Data Splitting via Minipatch Ensembles
Statistics > Machine Learning
arXiv:2206.02088 (stat)
[Submitted on 5 Jun 2022 (v1), last revised 23 Mar 2026 (this version, v3)]

Title: LOCO Feature Importance Inference without Data Splitting via Minipatch Ensembles
Authors: Luqin Gan, Lili Zheng, Genevera I. Allen

Abstract: Feature importance inference is critical for the interpretability and reliability of machine learning models. There has been increasing interest in developing model-agnostic approaches to interpret any predictive model, often in the form of feature occlusion or leave-one-covariate-out (LOCO) inference. Existing methods typically make limiting distributional or modeling assumptions and require data splitting. In this work, we develop a novel, mostly model-agnostic, and distribution-free inference framework for feature importance in regression or classification tasks that does not require data splitting. Our approach leverages a form of random observation and feature subsampling called minipatch ensembles; it uses the trained ensembles for inference and requires no model refitting or held-out test data after training. We show that our approach enjoys both computational and statistical efficiency and circumvents the interpretational challenges of data splitting. Further, despite using the same data for training and inference, we show ...
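
To make the minipatch LOCO idea concrete, below is a minimal sketch in Python (using numpy and scikit-learn, which the paper does not prescribe). Names and settings such as ensemble_pred, K, m, and l are illustrative assumptions, and the normal-approximation interval at the end merely stands in for the paper's formal inference theory; treat this as a sketch of the mechanism, not the authors' implementation.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Toy regression data: only the first two features matter.
n, p = 200, 10
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.5 * rng.standard_normal(n)

# Train an ensemble of K minipatches: each base learner sees a random
# subset of m observations and l features (values here are assumptions).
K, m, l = 500, 40, 4
patches = []  # (model, training-row set, feature-column array)
for _ in range(K):
    rows = rng.choice(n, size=m, replace=False)
    cols = rng.choice(p, size=l, replace=False)
    tree = DecisionTreeRegressor(max_depth=3).fit(X[rows][:, cols], y[rows])
    patches.append((tree, set(rows), cols))

def ensemble_pred(i, exclude_feature=None):
    # Average over minipatches that did NOT train on observation i
    # (leave-one-observation-out by construction) and, optionally,
    # did not use the given feature (leave-one-covariate-out).
    preds = [tree.predict(X[i:i + 1, cols])[0]
             for tree, rows, cols in patches
             if i not in rows
             and (exclude_feature is None or exclude_feature not in cols)]
    return np.mean(preds)

# Per-observation LOCO importance for feature j:
# Delta_j(i) = |y_i - mu_{-i,-j}(x_i)| - |y_i - mu_{-i}(x_i)|.
j = 0
deltas = np.array([abs(y[i] - ensemble_pred(i, exclude_feature=j))
                   - abs(y[i] - ensemble_pred(i)) for i in range(n)])

# Crude normal-approximation interval; the paper derives valid
# asymptotics for this kind of statistic, which this line does not replicate.
mean, se = deltas.mean(), deltas.std(ddof=1) / np.sqrt(n)
print(f"feature {j}: LOCO importance {mean:.3f}, approx 95% CI +/- {1.96 * se:.3f}")

The key design point the sketch illustrates is why no data splitting is needed: because each observation is left out of most minipatches at random, its held-out prediction can be formed from the subset of already-trained minipatches that never saw it, so training and inference reuse the same data without model refitting.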