[2602.16241] Are LLMs Ready to Replace Bangla Annotators?

arXiv - AI · 3 min read

Summary

This article evaluates Large Language Models (LLMs) as zero-shot annotators for Bangla hate speech, revealing significant annotator bias and unstable judgments, with smaller, task-aligned models often proving more consistent than larger ones.

Why It Matters

Understanding the limitations of LLMs in low-resource languages like Bangla is crucial for developing reliable AI systems. This research highlights the need for careful evaluation of AI tools in sensitive contexts, which is essential for ethical AI deployment and ensuring fair outcomes.

Key Takeaways

  • LLMs exhibit significant annotator bias and instability in judgments.
  • Larger models do not necessarily provide better annotation quality than smaller, task-specific models.
  • The study emphasizes the importance of evaluating AI tools before deployment in sensitive tasks.
  • Current LLMs may not be suitable for low-resource languages without further refinement.
  • Understanding model behavior is critical for ethical AI applications.

Computer Science > Computation and Language

arXiv:2602.16241 (cs) [Submitted on 18 Feb 2026]

Title: Are LLMs Ready to Replace Bangla Annotators?
Authors: Md. Najib Hasan, Touseef Hasan, Souvika Sarkar

Abstract: Large Language Models (LLMs) are increasingly used as automated annotators to scale dataset creation, yet their reliability as unbiased annotators, especially for low-resource and identity-sensitive settings, remains poorly understood. In this work, we study the behavior of LLMs as zero-shot annotators for Bangla hate speech, a task where even human agreement is challenging and annotator bias can have serious downstream consequences. We conduct a systematic benchmark of 17 LLMs using a unified evaluation framework. Our analysis uncovers annotator bias and substantial instability in model judgments. Surprisingly, increased model scale does not guarantee improved annotation quality: smaller, more task-aligned models frequently exhibit more consistent behavior than their larger counterparts. These results highlight important limitations of current LLMs for sensitive annotation tasks in low-resource languages and underscore the need for careful evaluation before deployment.

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as: arXiv:2602.16241 [cs.CL] (or arXiv:2602.16241v1 [cs.CL] for this version) https://doi.org...
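The paper's unified evaluation framework is not detailed in this summary, but the two failure modes it reports, annotator bias and judgment instability, lend themselves to simple measurements. Below is a minimal Python sketch, purely illustrative and not the paper's actual setup, of how one might quantify them: a per-model label flip rate across repeated zero-shot runs, and raw pairwise agreement between models. The annotate stub, model names, and example texts are hypothetical placeholders.

```python
import itertools

# Hypothetical stand-in for a zero-shot LLM call; in practice this would
# prompt a model to label a Bangla text as "hate" or "not_hate".
def annotate(model: str, text: str, seed: int) -> str:
    # Deterministic toy behavior (within one process) so the sketch
    # runs without any API access.
    return "hate" if hash((model, text, seed)) % 2 == 0 else "not_hate"

def flip_rate(model: str, texts: list[str], runs: int = 5) -> float:
    """Fraction of items whose label changes across repeated runs
    (one crude measure of judgment instability)."""
    unstable = 0
    for text in texts:
        labels = {annotate(model, text, seed) for seed in range(runs)}
        unstable += len(labels) > 1
    return unstable / len(texts)

def pairwise_agreement(models: list[str], texts: list[str]) -> float:
    """Mean raw agreement between model pairs on a fixed run
    (a simple proxy for divergent annotator bias)."""
    scores = []
    for a, b in itertools.combinations(models, 2):
        same = sum(annotate(a, t, 0) == annotate(b, t, 0) for t in texts)
        scores.append(same / len(texts))
    return sum(scores) / len(scores)

if __name__ == "__main__":
    texts = ["example sentence 1", "example sentence 2", "example sentence 3"]
    models = ["model-small", "model-large"]
    for m in models:
        print(m, "flip rate:", flip_rate(m, texts))
    print("mean pairwise agreement:", pairwise_agreement(models, texts))
```

In the paper's setting, one would replace the stub with real model calls over a Bangla hate speech dataset and additionally report agreement against human gold labels.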

Related Articles

I let Gemini in Google Maps plan my day and it went surprisingly well | The Verge
Llms

Gemini in Google Maps is a surprisingly useful way to explore new territory.

The Verge - AI · 11 min ·
The person who replaces you probably won't be AI. It'll be someone from the next department over who learned to use it - opinion/discussion
Llms

I'm a strategy person by background. Two years ago I'd write a recommendation and hand it to a product team. Now... I describe what I want...

Reddit - Artificial Intelligence · 1 min ·
Block Resets Management With AI As Cash App Adds Installment Transfers
Llms

Block (NYSE:XYZ) plans a permanent organizational overhaul that replaces many middle management roles with AI-driven models to create fla...

AI Tools & Products · 5 min ·
Anthropic leaks source code for its AI coding agent Claude
Llms

Anthropic accidentally exposed roughly 512,000 lines of proprietary TypeScript source code for its AI-powered coding agent Claude Code.

AI Tools & Products · 3 min ·