[2603.22346] First-Mover Bias in Gradient Boosting Explanations: Mechanism, Detection, and Resolution
Computer Science > Machine Learning
arXiv:2603.22346 (cs)
[Submitted on 22 Mar 2026]

Title: First-Mover Bias in Gradient Boosting Explanations: Mechanism, Detection, and Resolution
Authors: Drake Caraker, Bryan Arnold, David Rhoads

Abstract: We isolate and empirically characterize first-mover bias -- a path-dependent concentration of feature importance caused by sequential residual fitting in gradient boosting -- as a specific mechanistic cause of the well-known instability of SHAP-based feature rankings under multicollinearity. When correlated features compete for early splits, gradient boosting creates a self-reinforcing advantage for whichever feature is selected first: subsequent trees inherit modified residuals that favor the incumbent, concentrating SHAP importance on an arbitrary feature rather than distributing it across the correlated group. Scaling up a single model amplifies this effect -- a Large Single Model with the same total tree count as our method produces the worst explanations of any approach tested. We demonstrate that model independence is sufficient to resolve first-mover bias in the linear regime, and remains the most effective mitigation under nonlinear data-generating processes. Both our proposed method, DASH (Diversified Aggregation of SHAP), and simple seed-averaging ...
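The mechanism described in the abstract can be illustrated with a small sketch. This is not the paper's DASH implementation: it uses scikit-learn's impurity-based `feature_importances_` as a stand-in for SHAP values, a synthetic pair of correlated features standing in for the paper's data-generating process, and plain seed-averaging as the independence-based mitigation. The point is only to show that which correlated feature "wins" can flip across seeds, while averaging over seeds spreads credit across the correlated group.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Illustrative sketch (assumptions: impurity importances as a SHAP proxy,
# seed-averaging as the independence mitigation -- not the paper's DASH code).
rng = np.random.default_rng(0)
n = 2000
z = rng.normal(size=n)              # latent signal
x1 = z + 0.05 * rng.normal(size=n)  # correlated copy 1 of the signal
x2 = z + 0.05 * rng.normal(size=n)  # correlated copy 2 of the signal
x3 = rng.normal(size=n)             # independent noise feature
X = np.column_stack([x1, x2, x3])
y = z + 0.1 * rng.normal(size=n)

def importances(seed):
    # subsample < 1 makes tree construction seed-dependent, so different
    # seeds can select a different "first mover" among x1 and x2; later
    # trees then fit residuals that favor the incumbent
    model = GradientBoostingRegressor(
        n_estimators=100, subsample=0.5, random_state=seed)
    model.fit(X, y)
    return model.feature_importances_

per_seed = np.array([importances(s) for s in range(10)])
avg = per_seed.mean(axis=0)  # seed-averaged analogue of aggregating SHAP

print("per-seed winner among (x1, x2):", per_seed[:, :2].argmax(axis=1))
print("averaged importances:", avg.round(3))
```

In any single run, one of the two correlated features tends to dominate the other; the seed-averaged importances keep nearly all credit on the correlated pair (and little on the noise feature) while distributing it across the group rather than concentrating it on an arbitrary member.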