
[2602.18201] SOMtime the World Ain't Fair: Violating Fairness Using Self-Organizing Maps

arXiv - Machine Learning, February 23, 2026

Summary

The paper shows that unsupervised representations are not neutral with respect to sensitive attributes. Using SOMtime, a representation method based on high-capacity Self-Organizing Maps (SOMs), the authors demonstrate that attributes such as age and income emerge as dominant axes of the embedding even when they are excluded from the training data.

Why It Matters

This research undercuts the assumption that withholding sensitive attributes from an unsupervised model is enough to make its output fair ('fairness through unawareness'). It makes the case for auditing the unsupervised components of machine learning pipelines, which matters more as AI systems are deployed in sensitive areas.

Key Takeaways

Sensitive attributes can emerge in unsupervised embeddings despite exclusion from training data.
SOMtime recovers orderings aligned with withheld attributes far more strongly than PCA, UMAP, t-SNE, and autoencoders: Spearman correlations reach 0.85 for SOMtime, versus at most 0.34 for the baselines.
Demographically skewed clusters in unsupervised segmentation pose fairness risks.
Fairness auditing must extend to unsupervised components of machine learning.
The findings challenge the notion of 'fairness through unawareness' in AI.
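
As a rough illustration of what auditing an unsupervised segmentation could involve, the sketch below compares each cluster's demographic mix with the overall mix. It assumes cluster labels and a withheld group attribute are both available after the fact; the function name `cluster_skew` and the toy data are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict

def cluster_skew(cluster_ids, groups):
    """For each cluster, return (share of group in cluster) - (overall share of group)."""
    if len(cluster_ids) != len(groups):
        raise ValueError("cluster_ids and groups must have the same length")
    n = len(groups)
    overall = {g: c / n for g, c in Counter(groups).items()}

    # Count group membership inside each cluster.
    by_cluster = defaultdict(Counter)
    for cid, g in zip(cluster_ids, groups):
        by_cluster[cid][g] += 1

    skew = {}
    for cid, counts in by_cluster.items():
        size = sum(counts.values())
        # Counter returns 0 for groups absent from this cluster.
        skew[cid] = {g: counts[g] / size - overall[g] for g in overall}
    return skew

# Toy data (hypothetical): two clusters, one withheld binary attribute.
clusters = [0, 0, 0, 0, 1, 1, 1, 1]
age_band = ["young", "young", "young", "old", "old", "old", "old", "young"]
skew = cluster_skew(clusters, age_band)
```

Values near zero mean a cluster mirrors the overall population; large positive or negative values flag a demographically skewed cluster. What counts as "large" is a policy choice that this sketch leaves open.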

arXiv:2602.18201 (cs), Computer Science > Artificial Intelligence. Submitted on 20 Feb 2026.

Title: SOMtime the World Ain't Fair: Violating Fairness Using Self-Organizing Maps

Authors: Joseph Bingham, Netanel Arussy, Dvir Aran

Abstract: Unsupervised representations are widely assumed to be neutral with respect to sensitive attributes when those attributes are withheld from training. We show that this assumption is false. Using SOMtime, a topology-preserving representation method based on high-capacity Self-Organizing Maps, we demonstrate that sensitive attributes such as age and income emerge as dominant latent axes in purely unsupervised embeddings, even when explicitly excluded from the input. On two large-scale real-world datasets (the World Values Survey across five countries and the Census-Income dataset), SOMtime recovers monotonic orderings aligned with withheld sensitive attributes, achieving Spearman correlations of up to 0.85, whereas PCA and UMAP typically remain below 0.23 (with a single exception reaching 0.31), and against t-SNE and autoencoders which achieve at most 0.34. Furthermore, unsupervised segmentation of SOMtime embeddings produces demographically skewed clusters, demonstrating downstream fairness risks without any supervised task. These findings establish that \te...
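
The kind of check the abstract reports, correlating each embedding axis with an attribute that was withheld from training, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' procedure: how SOMtime derives its ordering is not described here, and the helper names (`average_ranks`, `spearman`, `audit_axes`) and the toy embedding are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def audit_axes(embedding, sensitive):
    """Spearman rho between each embedding axis and the withheld attribute."""
    dims = len(embedding[0])
    return [spearman([row[d] for row in embedding], sensitive) for d in range(dims)]

# Toy data (hypothetical): a 2-D embedding and an age column never shown to the model.
embedding = [(0.1, 5.0), (0.4, 1.0), (0.5, 4.0), (0.9, 2.0), (1.3, 3.0)]
age = [21, 30, 34, 52, 67]
rhos = audit_axes(embedding, age)
```

In this toy case the first axis orders the rows exactly by age, so its rho is 1.0, while the second axis is only weakly related. An axis with a large absolute rho signals that the withheld attribute has re-emerged in the representation.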
