[2603.05057] MUTEX: Leveraging Multilingual Transformers and Conditional Random Fields for Enhanced Urdu Toxic Span Detection

[2603.05057] MUTEX: Leveraging Multilingual Transformers and Conditional Random Fields for Enhanced Urdu Toxic Span Detection

arXiv - AI 4 min read

About this article

Abstract page for arXiv paper 2603.05057: MUTEX: Leveraging Multilingual Transformers and Conditional Random Fields for Enhanced Urdu Toxic Span Detection

Computer Science > Computation and Language arXiv:2603.05057 (cs) [Submitted on 5 Mar 2026] Title:MUTEX: Leveraging Multilingual Transformers and Conditional Random Fields for Enhanced Urdu Toxic Span Detection Authors:Inayat Arshad, Fajar Saleem, Ijaz Hussain View a PDF of the paper titled MUTEX: Leveraging Multilingual Transformers and Conditional Random Fields for Enhanced Urdu Toxic Span Detection, by Inayat Arshad and 2 other authors View PDF HTML (experimental) Abstract:Urdu toxic span detection remains limited because most existing systems rely on sentence-level classification and fail to identify the specific toxic spans within those text. It is further exacerbated by the multiple factors i.e. lack of token-level annotated resources, linguistic complexity of Urdu, frequent code-switching, informal expressions, and rich morphological variations. In this research, we propose MUTEX: a multilingual transformer combined with conditional random fields (CRF) for Urdu toxic span detection framework that uses manually annotated token-level toxic span dataset to improve performance and interpretability. MUTEX uses XLM RoBERTa with CRF layer to perform sequence labeling and is tested on multi-domain data extracted from social media, online news, and YouTube reviews using token-level F1 to evaluate fine-grained span detection. The results indicate that MUTEX achieves 60% token-level F1 score that is the first supervised baseline for Urdu toxic span detection. Further examinati...

Originally published on March 06, 2026. Curated by AI News.

Related Articles

Machine Learning

[P] Unix philosophy for ML pipelines: modular, swappable stages with typed contracts

We built an open-source prototype that applies Unix philosophy to retrieval pipelines. Each stage (PII redaction, chunking, dedup, embedd...

Reddit - Machine Learning · 1 min ·
Machine Learning

Making an AI native sovereign computational stack

I’ve been working on a personal project that ended up becoming a kind of full computing stack: identity / trust protocol decentralized ch...

Reddit - Artificial Intelligence · 1 min ·
Llms

An attack class that passes every current LLM filter - no payload, no injection signature, no log trace

https://shapingrooms.com/research I published a paper today on something I've been calling postural manipulation. The short version: ordi...

Reddit - Artificial Intelligence · 1 min ·
Machine Learning

What tools are sr MLEs using? (clawdbot, openspec, wispr) [D]

I'm already blasting cursor, but I want to level up my output. I heard that these kind of AI tools and workflows are being asked in SF. W...

Reddit - Machine Learning · 1 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime