[2603.02475] Large-Scale Dataset and Benchmark for Skin Tone Classification in the Wild
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.02475 (cs)
[Submitted on 2 Mar 2026]

Title: Large-Scale Dataset and Benchmark for Skin Tone Classification in the Wild
Authors: Vitor Pereira Matias, Márcus Vinícius Lobo Costa, João Batista Neto, Tiago Novello de Brito

Abstract: Deep learning models often inherit biases from their training data. While fairness across gender and ethnicity is well studied, fine-grained skin tone analysis remains a challenge due to the lack of granular, annotated datasets. Existing methods often rely on the medical 6-tone Fitzpatrick scale, which lacks visual representativeness, or use small, private datasets that prevent reproducibility. They also often rely on classic computer vision pipelines, with only a few using deep learning, and they overlook issues like train-test leakage and dataset imbalance. In this work, we present a comprehensive framework for skin tone fairness. First, we introduce the STW, a large-scale, open-access dataset comprising 42,313 images from 3,564 individuals, labeled using the 10-tone MST scale. Second, we benchmark both Classic Computer Vision (SkinToneCCV) and Deep Learning approaches, demonstrating that classic models provide near-random results, while deep learning reaches nearly ann...
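The abstract names train-test leakage as an issue that prior work overlooks. With several images per individual, one common way to avoid it is to split by person rather than by image, so that no individual appears in both sets. The following is a minimal sketch of that idea, not the authors' procedure: the function name `split_by_individual`, the `(image_path, person_id, label)` tuple layout, and the synthetic demo data are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_individual(samples, test_fraction=0.2, seed=0):
    """Split samples so that each person lands entirely in train or in test.

    `samples` is a list of (image_path, person_id, label) tuples.
    This layout is hypothetical, not the STW dataset's actual format.
    """
    # Group every image under the individual it depicts.
    by_person = defaultdict(list)
    for sample in samples:
        by_person[sample[1]].append(sample)

    # Shuffle the individuals (not the images) reproducibly.
    person_ids = sorted(by_person)
    random.Random(seed).shuffle(person_ids)

    # Hold out a fraction of the individuals, at least one.
    n_test = max(1, round(len(person_ids) * test_fraction))
    test_ids = person_ids[:n_test]
    train_ids = person_ids[n_test:]

    train = [s for pid in train_ids for s in by_person[pid]]
    test = [s for pid in test_ids for s in by_person[pid]]
    return train, test


# Synthetic demo: 20 people, 3 images each, placeholder labels in 1..10.
samples = [
    (f"img_{person}_{i}.jpg", person, person % 10 + 1)
    for person in range(20)
    for i in range(3)
]
train, test = split_by_individual(samples)
```

Splitting at the person level means the test fraction applies to individuals, so the image-level ratio can drift when people have very different image counts. The sketch also does not address class imbalance, which would need a separate step such as stratifying the individuals by label.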