[2507.21807] MIBoost: A Gradient Boosting Algorithm for Variable Selection After Multiple Imputation

arXiv - Machine Learning · 4 min read

Summary

MIBoost is a novel gradient boosting algorithm for variable selection after multiple imputation: it optimizes a single loss pooled across the imputed datasets, yielding one unified set of selected variables and coefficients.

Why It Matters

This research is significant as it provides a solution to the common problem of missing data in predictive modeling. By enhancing variable selection methods, MIBoost could improve the accuracy of predictions in various fields, making it a valuable tool for statisticians and data scientists dealing with incomplete datasets.

Key Takeaways

  • MIBoost offers a unified variable-selection mechanism across multiple imputed datasets.
  • The algorithm extends the unified-loss principle, previously applied to LASSO and elastic nets, to component-wise gradient boosting.
  • Simulation studies indicate MIBoost achieves comparable predictive performance to other advanced methods.
  • Addressing missing data effectively can enhance model reliability and insights.
  • The research contributes to ongoing discussions about optimal model selection techniques.
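Before any of this applies, multiple imputation must first produce several completed copies of the data. Below is a minimal sketch of that step; it fills each missing entry with a draw from a normal distribution fitted to the column's observed values, a crude stand-in for a proper imputation model such as MICE (the function name and approach are illustrative, not from the paper).

```python
import numpy as np

def multiply_impute(X, M=5, seed=0):
    """Create M completed copies of X by drawing each missing entry from a
    normal fitted to the column's observed values. A crude stand-in for a
    real imputation model (e.g. MICE); for illustration only."""
    rng = np.random.default_rng(seed)
    imputed = []
    for _ in range(M):
        Xm = X.copy()
        for j in range(X.shape[1]):
            col = X[:, j]
            miss = np.isnan(col)
            if miss.any():
                obs = col[~miss]
                # Draw imputations independently per copy, so the M datasets
                # reflect uncertainty about the missing values.
                Xm[miss, j] = rng.normal(obs.mean(), obs.std(ddof=1), miss.sum())
        imputed.append(Xm)
    return imputed
```

Because each copy draws fresh values, the M datasets differ wherever entries were missing, which is precisely what makes downstream model selection ambiguous.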

Statistics > Machine Learning — arXiv:2507.21807 (stat)

[Submitted on 29 Jul 2025 (v1), last revised 23 Feb 2026 (this version, v5)]

Title: MIBoost: A Gradient Boosting Algorithm for Variable Selection After Multiple Imputation

Authors: Robert Kuchen

Abstract: Statistical learning methods for automated variable selection, such as LASSO, elastic nets, or gradient boosting, have become increasingly popular tools for building powerful prediction models. Yet, in practice, analyses are often complicated by missing data. The most widely used approach to address missingness is multiple imputation, which involves creating several completed datasets. However, there is an ongoing debate on how to perform model selection in the presence of multiple imputed datasets. Simple strategies, such as pooling models across datasets, have been shown to have suboptimal properties. Although more sophisticated methods exist, they are often difficult to implement and therefore not widely applied. In contrast, two recent approaches modify the regularization methods LASSO and elastic nets by defining a single loss function, resulting in a unified set of coefficients across imputations. Our key contribution is to extend this principle to the framework of component-wise gradient boosting by proposing MIBoost, a novel algorithm that employs a u...
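The unified-loss idea described in the abstract can be sketched in code. The snippet below is an illustrative implementation of component-wise L2 boosting with a loss averaged across the M imputed datasets, so that a single base learner (variable) is selected per iteration and one shared coefficient vector results. It is a sketch of the principle under simplifying assumptions (squared-error loss, simple linear base learners), not the exact MIBoost algorithm; all names are hypothetical.

```python
import numpy as np

def unified_componentwise_boost(imputed_Xs, ys, n_iter=100, nu=0.1):
    """Component-wise L2 boosting with one loss pooled across M imputed
    datasets. Illustrative sketch of the unified-loss principle only."""
    M = len(imputed_Xs)
    p = imputed_Xs[0].shape[1]
    beta = np.zeros(p)                        # one unified coefficient vector
    residuals = [y.astype(float).copy() for y in ys]
    for _ in range(n_iter):
        best_j, best_loss, best_coefs = None, np.inf, None
        for j in range(p):                    # candidate base learners
            coefs, loss = [], 0.0
            for X, r in zip(imputed_Xs, residuals):
                xj = X[:, j]
                b = xj @ r / (xj @ xj)        # univariate least-squares fit
                coefs.append(b)
                loss += np.mean((r - b * xj) ** 2)
            loss /= M                         # loss averaged over imputations
            if loss < best_loss:
                best_j, best_loss, best_coefs = j, loss, coefs
        b_bar = np.mean(best_coefs)           # single update shared by all datasets
        beta[best_j] += nu * b_bar
        for X, r in zip(imputed_Xs, residuals):
            r -= nu * b_bar * X[:, best_j]    # update residuals in every dataset
    return beta
```

Because the same variable and the same averaged coefficient update are applied to every imputed dataset, the selected model is identical across imputations by construction, avoiding the pooling problem the abstract describes.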
