[2602.15373] Far Out: Evaluating Language Models on Slang in Australian and Indian English
Summary
This paper evaluates the performance of language models on slang in Australian and Indian English, revealing significant gaps in understanding non-standard language varieties.
Why It Matters
Understanding how language models handle slang is crucial for improving their effectiveness in diverse linguistic contexts. This research highlights the need for better model training on variety-specific language, which is essential for applications in natural language processing and AI development.
Key Takeaways
- Language models show performance gaps in understanding slang.
- Australian English slang is less accurately processed than Indian English slang.
- Models perform better on real-world data compared to synthetically generated examples.
- Target word selection tasks yield higher accuracy than prediction tasks.
- The study underscores the importance of training models on diverse language varieties.
Computer Science > Computation and Language
arXiv:2602.15373 (cs)
Submitted on 17 Feb 2026
Authors: Deniz Kaya Dilsiz, Dipankar Srirag, Aditya Joshi
Abstract: Language models exhibit systematic performance gaps when processing text in non-standard language varieties, yet their ability to comprehend variety-specific slang remains underexplored for several languages. We present a comprehensive evaluation of slang awareness in Indian English (en-IN) and Australian English (en-AU) across seven state-of-the-art language models. We construct two complementary datasets: WEB, containing 377 web-sourced usage examples from Urban Dictionary, and GEN, featuring 1,492 synthetically generated usages of these slang terms across diverse scenarios. We assess language models on three tasks: target word prediction (TWP), guided target word prediction (TWP*), and target word selection (TWS). Our results reveal four key findings: (1) higher average model performance on TWS versus TWP and TWP*, with average accuracy increasing from 0.03 to 0.49 respectively; (2) stronger average model performance on the WEB versus GEN datasets, with average similarity score increasing by 0.03 and 0.05 across TWP...
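To make the task setup concrete, the sketch below shows how accuracy for a target word selection (TWS) style task could be computed: the model chooses a slang term for each masked usage example, and accuracy is the fraction of choices matching the gold term. This is a minimal illustration, not the paper's evaluation code; the example data and slang glosses are invented for demonstration.

```python
def tws_accuracy(examples):
    """Accuracy for a target-word-selection task.

    Each example pairs the model's selected candidate with the gold
    slang term; a case-insensitive exact match counts as correct.
    """
    if not examples:
        return 0.0
    correct = sum(1 for selected, gold in examples
                  if selected.lower() == gold.lower())
    return correct / len(examples)


# Illustrative (invented) en-AU / en-IN pairs: (model choice, gold term)
examples = [
    ("arvo", "arvo"),        # en-AU slang for "afternoon"
    ("servo", "servo"),      # en-AU slang for "service station"
    ("prepone", "prepone"),  # en-IN slang for "move to an earlier time"
    ("bogan", "chockers"),   # an incorrect selection
]
print(round(tws_accuracy(examples), 2))  # 0.75
```

The same harness shape works for the prediction tasks (TWP, TWP*) by swapping the exact-match check for a similarity score, which is the metric the abstract reports for those tasks.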