[2605.00056] Smart Ensemble Learning Framework for Predicting Groundwater Heavy Metal Pollution
Computer Science > Machine Learning
arXiv:2605.00056 (cs)
[Submitted on 29 Apr 2026]

Title: Smart Ensemble Learning Framework for Predicting Groundwater Heavy Metal Pollution

Authors: T. Ansah-Narh, G. Y. Afrifa, J. B. Tandoh, K. Asare, M. Addi, K. E. Yorke, D. M. A. Akpoley, K. Aidoo, S. K. Fosuhene

Abstract: Groundwater in the Densu Basin is increasingly threatened by heavy metal contamination, but conventional methods fail to capture the statistical complexity and spatial heterogeneity of pollution indicators. A key challenge is modelling the Heavy Metal Pollution Index (HPI), which is typically skewed and affected by correlated contaminants, leading to biased predictions without transformation. This study develops a predictive framework integrating response transformations with nested cross-validated ensemble machine learning. Three transformations (raw, log, and Gaussian copula) were applied to HPI and evaluated across six learners: support vector regression (SVM), $k$-nearest neighbours (k-NN), CART, Elastic Net, kernel ridge regression, and a stacked Lasso ensemble. Raw-scale models produced deceptively high fits (Elastic Net and stacked ensemble $R^2 \approx 1.0$), suggesting over-optimism. The log transformation stabilised variance (SVM: $R^2 = 0.93$, RMSE $= 0.18$; k-NN: $R^2 = 0.92$, RMS...
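
The response transformations named in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the Gaussian copula transformation is computed, so the version below assumes a common rank-based normal-score mapping (ranks converted to standard normal quantiles); it may differ from the authors' implementation. The function names and the HPI values are hypothetical and chosen only to show how each transformation changes the skewness of a right-skewed response.

```python
import math
from statistics import NormalDist


def log_transform(y):
    """Natural log of a strictly positive response."""
    return [math.log(v) for v in y]


def normal_scores(y):
    """Rank-based normal scores (assumed form of the Gaussian copula transform).

    Each value is replaced by the standard normal quantile of rank / (n + 1).
    Tied values receive the average of their ranks.
    """
    n = len(y)
    order = sorted(range(n), key=lambda i: y[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with y[order[i]].
        while j + 1 < n and y[order[j + 1]] == y[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


def skewness(x):
    """Population skewness: third central moment over variance^(3/2)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    return m3 / m2 ** 1.5


# Hypothetical, right-skewed HPI values (not taken from the paper).
hpi = [12.0, 15.0, 18.0, 22.0, 30.0, 45.0, 80.0, 150.0, 400.0]

for name, values in [
    ("raw", hpi),
    ("log", log_transform(hpi)),
    ("normal scores", normal_scores(hpi)),
]:
    print(f"{name:>13}: skewness = {skewness(values):.3f}")
```

In this sketch the log transform reduces the skewness of the made-up sample, while the normal-score mapping makes it symmetric by construction. In a nested cross-validation setting, a rank-based transform of this kind would need to be fitted on the training folds only, so that no information from held-out data leaks into the transformed response.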