[2503.07313] The influence of missing data mechanisms and simple missing data handling techniques on fairness



Summary

This article explores how different missing data mechanisms and handling techniques affect the fairness of machine learning algorithms, revealing that listwise deletion generally yields the highest fairness across various classification methods.

Why It Matters

Understanding the impact of missing data on algorithmic fairness is crucial as machine learning systems increasingly influence decision-making in various sectors. This research highlights the importance of selecting appropriate data handling techniques to mitigate bias and enhance fairness in AI applications.

Key Takeaways

  • Missing data mechanisms can influence the fairness of machine learning algorithms.
  • Listwise deletion often provides the highest fairness among the handling techniques compared.
  • Random forests tend to achieve the highest fairness among the classification algorithms compared.
  • The interaction between data handling techniques and algorithms is significant.
  • Limited research exists on how missing data, and the way it is handled, affects algorithmic fairness.
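
To make "fairness" in the takeaways above concrete, here is a minimal sketch of one common group-fairness measure, the demographic parity difference. This summary does not say which fairness metrics the paper uses, so the metric, the function names, and the toy predictions below are all assumptions chosen for illustration.

```python
from typing import Sequence


def positive_rate(preds: Sequence[int]) -> float:
    """Share of positive (1) predictions; 0.0 for an empty group."""
    return sum(preds) / len(preds) if preds else 0.0


def demographic_parity_difference(y_pred: Sequence[int], group: Sequence[int]) -> float:
    """Absolute gap in positive-prediction rates between group 0 and group 1.

    A value of 0.0 means both groups receive positive predictions at the
    same rate; larger values indicate a larger disparity.
    """
    g0 = [p for p, g in zip(y_pred, group) if g == 0]
    g1 = [p for p, g in zip(y_pred, group) if g == 1]
    return abs(positive_rate(g0) - positive_rate(g1))


# Hypothetical classifier outputs for eight individuals in two groups.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]

dpd = demographic_parity_difference(y_pred, group)
print(f"demographic parity difference: {dpd:.2f}")
```

Computing such a metric once per combination of handling technique and classifier is one way to compare the combinations on equal footing.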

arXiv:2503.07313 (stat), Statistics > Machine Learning
Submitted on 10 Mar 2025 (v1), last revised 19 Feb 2026 (this version, v2)

Title: The influence of missing data mechanisms and simple missing data handling techniques on fairness

Authors: Aeysha Bhatti, Trudie Sandrock, Johane Nienkemper-Swanepoel

Abstract: Machine learning algorithms permeate the day-to-day aspects of our lives and therefore studying the fairness of these algorithms before implementation is crucial. One way in which bias can manifest in a dataset is through missing values. Missing data are often assumed to be missing completely randomly; in reality the propensity of data being missing is often tied to the demographic characteristics of individuals. There is limited research into how missing values and the handling thereof can impact the fairness of an algorithm. Most researchers either apply listwise deletion or tend to use simpler methods of imputation (e.g. mean or mode) compared to more advanced approaches (e.g. multiple imputation). This study considers the fairness of various classification algorithms after a range of missing data handling strategies is applied. Missing values are generated (i.e. amputed) in three popular datasets for classification fairness, by creating a high percentage of missing values ...
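
The abstract's point that missingness is often tied to demographic characteristics can be illustrated with a small sketch. It amputes a synthetic feature twice, once with the same missingness probability for both groups and once with a group-dependent probability, and then applies listwise deletion and mean imputation. The data, the probabilities, and every function name here are hypothetical; this is not the paper's amputation procedure or its datasets.

```python
import random
from statistics import mean


def ampute(rows, feature, prob_by_group, rng):
    """Copy rows, setting `feature` to None with a group-specific probability."""
    out = []
    for row in rows:
        new = dict(row)
        if rng.random() < prob_by_group[row["group"]]:
            new[feature] = None
        out.append(new)
    return out


def listwise_delete(rows, feature):
    """Drop every row in which `feature` is missing."""
    return [r for r in rows if r[feature] is not None]


def mean_impute(rows, feature):
    """Fill missing `feature` values with the mean of the observed values.

    Assumes at least one observed value exists.
    """
    fill = mean(r[feature] for r in rows if r[feature] is not None)
    return [dict(r, **{feature: fill}) if r[feature] is None else dict(r) for r in rows]


def group_share(rows, g):
    """Fraction of rows that belong to group g."""
    return sum(r["group"] == g for r in rows) / len(rows)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Synthetic data: two equally sized groups with different average income.
rows = [
    {"group": i % 2, "income": 30.0 + 10.0 * (i % 2) + rng.gauss(0, 5)}
    for i in range(1000)
]

# Same missingness probability in both groups (independent of group).
mcar = ampute(rows, "income", {0: 0.3, 1: 0.3}, rng)
# Missingness depends on the observed group label (assumed probabilities).
mar = ampute(rows, "income", {0: 0.1, 1: 0.5}, rng)

mar_deleted = listwise_delete(mar, "income")
mar_imputed = mean_impute(mar, "income")

print("group 1 share, original:       ", group_share(rows, 1))
print("group 1 share, after deletion: ", round(group_share(mar_deleted, 1), 3))
print("rows kept by mean imputation:  ", len(mar_imputed))
```

In this toy setup, listwise deletion under group-dependent missingness shrinks group 1's share of the training data, while mean imputation keeps every row but pulls group 1's filled-in values toward the pooled mean. Either effect could plausibly shift a downstream fairness metric, which is the kind of interaction the study examines.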
