[2602.22101] On Imbalanced Regression with Hoeffding Trees

arXiv - Machine Learning

Summary

This paper explores Hoeffding trees for imbalanced regression on streaming data, extending kernel density estimation (KDE) and hierarchical shrinkage (HS) from batch learning to the incremental setting and evaluating when each technique actually helps.
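Hoeffding trees get their name from the Hoeffding bound, which lets an incremental learner decide, with high probability, that the best split candidate seen so far will remain the best. The sketch below shows the standard bound with illustrative values (the range, delta, and split scores are assumptions, not figures from the paper):

```python
import math

def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the observed mean of
    n samples of a variable with range `value_range` lies within epsilon
    of its true mean."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

# A Hoeffding tree accepts a split once the gap between the best and
# second-best candidate scores exceeds epsilon (illustrative numbers).
eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=200)
best, second = 0.62, 0.35
print(best - second > eps)  # gap of 0.27 vs. epsilon of about 0.20
```

As the stream grows, epsilon shrinks at a rate of 1/sqrt(n), so split decisions become increasingly confident without storing past examples.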

Why It Matters

Imbalanced regression is a common challenge in machine learning, particularly in applications where data arrives as a continuous stream. This research clarifies when techniques borrowed from batch learning, namely KDE-based smoothing and hierarchical shrinkage, improve incremental decision trees, which is useful for building robust predictive models on streaming data.

Key Takeaways

  • Hoeffding trees are effective for regression tasks in streaming data.
  • Kernel density estimation (KDE) improves performance in the early parts of the stream.
  • Hierarchical shrinkage (HS) shows limited benefits for decision trees in this context.
  • The study extends existing methods for better handling of imbalanced data.
  • Publicly available code supports further research and application.
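The KDE idea from Yang et al. (2021) smooths the empirical label distribution and reweights samples inversely to the smoothed density, so rare target values carry more weight. The following is a minimal pure-Python sketch of that reweighting; the bandwidth and example targets are illustrative assumptions, not values from the paper:

```python
import math

def gaussian_kde(targets, bandwidth):
    """Return a Gaussian kernel density estimate over target values."""
    def density(y):
        z = sum(math.exp(-0.5 * ((y - t) / bandwidth) ** 2) for t in targets)
        return z / (len(targets) * bandwidth * math.sqrt(2 * math.pi))
    return density

def inverse_density_weights(targets, bandwidth=0.5):
    """Weight each sample inversely to the KDE-smoothed label density,
    in the spirit of Yang et al. (2021); weights are normalized to mean 1."""
    density = gaussian_kde(targets, bandwidth)
    raw = [1.0 / max(density(t), 1e-12) for t in targets]
    mean = sum(raw) / len(raw)
    return [w / mean for w in raw]

ys = [1.0, 1.1, 0.9, 1.05, 5.0]  # 5.0 sits in a rare target region
ws = inverse_density_weights(ys)
print(ws[-1] > max(ws[:-1]))  # the rare sample receives the largest weight
```

In a streaming setting the density estimate would be maintained incrementally as examples arrive (the paper does this via a telescoping argument), rather than recomputed over a stored batch as in this sketch.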

Computer Science > Machine Learning

arXiv:2602.22101 (cs) [Submitted on 25 Feb 2026]

Title: On Imbalanced Regression with Hoeffding Trees

Authors: Pantia-Marina Alchirch, Dimitrios I. Diochnos

Abstract: Many real-world applications provide a continuous stream of data that is subsequently used by machine learning models to solve regression tasks of interest. Hoeffding trees and their variants have a long-standing tradition due to their effectiveness, either alone or as base models in broader ensembles. At the same time, a recent line of work in batch learning has shown that kernel density estimation (KDE) is an effective approach for smoothed predictions in imbalanced regression tasks [Yang et al., 2021]. Moreover, another recent line of work for batch learning, called hierarchical shrinkage (HS) [Agarwal et al., 2022], has introduced a post-hoc regularization method for decision trees that does not alter the structure of the learned tree. Using a telescoping argument, we cast KDE to streaming environments and extend the implementation of HS to incremental decision tree models. Armed with these extensions, we investigate the performance of decision trees that may enjoy such options on datasets commonly used for regression in online settings. We conclude that KDE is beneficial in the early parts of the stream, while HS hardly, if ...
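Hierarchical shrinkage is post-hoc: the tree's structure is untouched, and each prediction is rewritten as a telescoping sum along the root-to-leaf path, with every step shrunk by the parent node's sample count. The sketch below applies one commonly stated form of the HS update to a hand-built path; the path values and lambda are illustrative assumptions:

```python
def hierarchical_shrinkage(path, lam):
    """Apply hierarchical shrinkage along one root-to-leaf path.

    `path` is a list of (node_mean, n_samples) pairs from root to leaf.
    Following the form in Agarwal et al. (2022), the prediction telescopes:
        f(x) = mean_root + sum_l (mean_l - mean_{l-1}) / (1 + lam / n_{l-1})
    Larger lam pulls the leaf prediction toward its ancestors' means.
    """
    pred = path[0][0]  # start from the root mean
    for (parent_mean, parent_n), (child_mean, _) in zip(path, path[1:]):
        pred += (child_mean - parent_mean) / (1.0 + lam / parent_n)
    return pred

# Illustrative path: root (mean 10.0 over 100 samples) -> internal node
# (mean 14.0, 20 samples) -> leaf (mean 20.0, 5 samples).
path = [(10.0, 100), (14.0, 20), (20.0, 5)]
print(hierarchical_shrinkage(path, lam=0.0))    # -> 20.0 (plain leaf mean)
print(hierarchical_shrinkage(path, lam=100.0))  # -> 13.0 (shrunk toward root)
```

Because deeper nodes have fewer samples, their contributions are shrunk hardest, which is the regularization effect the paper evaluates for incremental trees.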
