
[2312.12715] Learning Performance Maximizing Ensembles with Explainability Guarantees

arXiv - Machine Learning, February 23, 2026

Summary

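This paper presents a method for allocating observations between an intrinsically explainable (glass box) model and a black box model. For any given explainability level, meaning the proportion of observations predicted by the explainable model, the allocation is chosen to maximize ensemble performance.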

Why It Matters

The research addresses a critical challenge in machine learning: balancing model performance with explainability. As AI systems become more integrated into decision-making processes, ensuring that models are both effective and interpretable is essential for trust and accountability.

Key Takeaways

Proposes a method for optimal observation allocation between explainable and black box models.
Maintains ensemble performance at high explainability levels, with the explainable model handling 74% of observations on average.
Shows that, in some cases, the ensemble outperforms both the component explainable model and the black box model while improving explainability.
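
To make the idea of an explainability level concrete, the sketch below routes a fixed fraction of observations to a glass box model and the rest to a black box model, then scores the resulting ensemble. It is a toy illustration under stated assumptions, not the paper's method: the function names, the allocator scores, and the data are all hypothetical and hand-picked, and the simple rank-by-score rule stands in for the learned allocation described in the paper.

```python
import numpy as np


def allocate(scores, level):
    """Return a boolean mask (True = glass box) covering the top `level` fraction.

    `scores` are hypothetical per-observation allocator scores; higher means
    the observation is a better fit for the explainable model.
    """
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    k = int(round(level * n))  # number of observations sent to the glass box
    mask = np.zeros(n, dtype=bool)
    if k > 0:
        top = np.argsort(-scores, kind="stable")[:k]
        mask[top] = True
    return mask


def ensemble_accuracy(mask, glass_pred, black_pred, y):
    """Accuracy when the glass box predicts where mask is True, black box elsewhere."""
    pred = np.where(mask, glass_pred, black_pred)
    return float(np.mean(pred == np.asarray(y)))


# Hand-built toy data (assumption): each model is wrong on two different rows.
y          = np.array([1, 0, 1, 1, 0, 0, 1, 0])
glass_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])  # wrong on rows 3 and 5
black_pred = np.array([1, 1, 1, 1, 0, 0, 0, 0])  # wrong on rows 1 and 6
scores     = np.array([0.9, 0.8, 0.7, 0.1, 0.6, 0.2, 0.95, 0.5])

mask = allocate(scores, level=0.75)  # explainability level of 75%
print("explained fraction:", mask.mean())
print("glass box alone:   ", float(np.mean(glass_pred == y)))
print("black box alone:   ", float(np.mean(black_pred == y)))
print("ensemble:          ", ensemble_accuracy(mask, glass_pred, black_pred, y))
```

On this hand-built example each component model gets 6 of 8 labels right, while the routed ensemble gets all 8 at an explainability level of 0.75. Real data would not separate this cleanly; the example only shows why a good allocation can beat both component models.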

Statistics > Machine Learning
arXiv:2312.12715 (stat)
[Submitted on 20 Dec 2023 (v1), last revised 20 Feb 2026 (this version, v3)]

Title: Learning Performance Maximizing Ensembles with Explainability Guarantees
Authors: Vincent Pisztora, Jia Li

Abstract: In this paper we propose a method for the optimal allocation of observations between an intrinsically explainable glass box model and a black box model. An optimal allocation being defined as one which, for any given explainability level (i.e. the proportion of observations for which the explainable model is the prediction function), maximizes the performance of the ensemble on the underlying task, and maximizes performance of the explainable model on the observations allocated to it, subject to the maximal ensemble performance condition. The proposed method is shown to produce such explainability optimal allocations on a benchmark suite of tabular datasets across a variety of explainable and black box model types. These learned allocations are found to consistently maintain ensemble performance at very high explainability levels (explaining $74\%$ of observations on average), and in some cases even outperforming both the component explainable and black box models while improving explainability.

Subjects: Machine Learning (stat.ML); Machine Learning (cs.L...
