[2604.06863] Digital Skin, Digital Bias: Uncovering Tone-Based Biases in LLMs and Emoji Embeddings
Computer Science > Social and Information Networks
arXiv:2604.06863 (cs)
[Submitted on 8 Apr 2026]

Title: Digital Skin, Digital Bias: Uncovering Tone-Based Biases in LLMs and Emoji Embeddings
Authors: Mingchen Li, Wajdi Aljedaani, Yingjie Liu, Navyasri Meka, Xuan Lu, Xinyue Ye, Junhua Ding, Yunhe Feng

Abstract: Skin-toned emojis are crucial for fostering personal identity and social inclusion in online communication. As AI models, particularly Large Language Models (LLMs), increasingly mediate interactions on web platforms, the risk that these systems perpetuate societal biases through their representation of such symbols is a significant concern. This paper presents the first large-scale comparative study of bias in skin-toned emoji representations across two distinct model classes. We systematically evaluate dedicated emoji embedding models (emoji2vec, emoji-sw2v) against four modern LLMs (Llama, Gemma, Qwen, and Mistral). Our analysis first reveals a critical performance gap: while LLMs demonstrate robust support for skin tone modifiers, widely-used specialized emoji models exhibit severe deficiencies. More importantly, a multi-faceted investigation into semantic consistency, representational similarity, sentiment polarity, and core biases uncovers systemic disparities. We find evidence of skew...
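One way to probe the representational similarity the abstract describes is to compare a model's embedding of a base emoji against its skin-toned variants. The sketch below is a minimal illustration, not the paper's actual method: the vectors are made-up placeholders standing in for embeddings that would, in practice, come from emoji2vec or an LLM's embedding layer.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Illustrative placeholder embeddings -- NOT taken from any real model.
# A tone-consistent model should assign near-equal similarity to every
# skin-tone variant of the same base emoji.
embeddings = {
    "\U0001F44D":             [0.90, 0.10, 0.30],  # thumbs-up, base
    "\U0001F44D\U0001F3FB":   [0.88, 0.12, 0.31],  # light skin tone
    "\U0001F44D\U0001F3FF":   [0.60, 0.40, 0.10],  # dark skin tone
}

base = embeddings["\U0001F44D"]
for emoji, vec in embeddings.items():
    if vec is base:
        continue
    print(f"{emoji}: {cosine(base, vec):.3f}")
```

In this toy setup, the large similarity gap between the light- and dark-toned variants is exactly the kind of tone-dependent disparity such an analysis is designed to surface; a fair representation would show comparable similarities across all five modifiers.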