[2602.12937] Curriculum Learning and Pseudo-Labeling Improve the Generalization of Multi-Label Arabic Dialect Identification Models
Summary
This paper treats Arabic Dialect Identification (ADI) as a multi-label classification task. It builds an automatically annotated (pseudo-labeled) multi-label dataset from single-label data and trains a BERT-based classifier with curriculum learning to improve generalization.
Why It Matters
ADI has long relied on single-label datasets, and no large-scale multi-label resource is available for training. As a result, sentences that are acceptable in several dialects are often treated as negatives for all but one of them. By generating multi-label annotations automatically and training on them, this work aims at more accurate dialect identification, which matters for NLP applications that handle dialectal Arabic.
Key Takeaways
- Reframing Arabic Dialect Identification as a multi-label classification task accounts for sentences that are acceptable in more than one dialect.
- Curriculum learning strategies aligned with dialectal complexity, combined with pseudo-labeling, improve model generalization.
- The LAHJATBERT model achieved a macro F1 score of 0.69, outperforming previous systems.
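For readers unfamiliar with the metric, the following is a minimal sketch of how a macro F1 score is computed in a multi-label setting: F1 is calculated separately for each dialect label and the per-label scores are averaged without weighting. The three-dialect label set and the toy predictions are hypothetical, and scoring a label with no true or predicted positives as 0.0 is an assumption of this sketch, not a detail taken from the paper.

```python
from typing import List, Sequence

# Hypothetical label set, for illustration only.
DIALECTS: List[str] = ["Egypt", "Syria", "Morocco"]


def macro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Unweighted mean of per-label F1 scores over multi-hot label rows."""
    n_labels = len(y_true[0])
    f1_scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumption: a label with no positives at all contributes 0.0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / n_labels


# Toy data: each row is one sentence, each column one dialect in DIALECTS.
y_true = [[1, 1, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
y_pred = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]

score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")  # prints "macro F1 = 0.833"
```

Because every label contributes equally to the average, a dialect with few examples affects the score as much as a frequent one.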
arXiv:2602.12937 (cs), Computer Science > Computation and Language
Submitted on 12 Feb 2026
Title: Curriculum Learning and Pseudo-Labeling Improve the Generalization of Multi-Label Arabic Dialect Identification Models
Authors: Ali Mekky, Mohamed El Zeftawy, Lara Hassan, Amr Keleg, Preslav Nakov
Abstract: Being modeled as a single-label classification task for a long time, recent work has argued that Arabic Dialect Identification (ADI) should be framed as a multi-label classification task. However, ADI remains constrained by the availability of single-label datasets, with no large-scale multi-label resources available for training. By analyzing models trained on single-label ADI data, we show that the main difficulty in repurposing such datasets for Multi-Label Arabic Dialect Identification (MLADI) lies in the selection of negative samples, as many sentences treated as negative could be acceptable in multiple dialects. To address these issues, we construct a multi-label dataset by generating automatic multi-label annotations using GPT-4o and binary dialect acceptability classifiers, with aggregation guided by the Arabic Level of Dialectness (ALDi). Afterward, we train a BERT-based multi-label classifier using curriculum learning strategies aligned with dialectal complexity and label car...
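The negative-sample problem described in the abstract can be illustrated with a small sketch. When a single-label example is converted naively into a multi-hot target, every dialect other than the annotated one becomes an implicit negative, even though the sentence may be acceptable in some of those dialects. The label set and both helper functions below are hypothetical and serve only to show the issue; they are not the authors' pipeline.

```python
from typing import List

# Hypothetical label set, for illustration only.
DIALECTS: List[str] = ["Egypt", "Syria", "Morocco"]


def to_multi_hot(label: str, dialects: List[str] = DIALECTS) -> List[int]:
    """Naive conversion: the annotated dialect is 1, every other dialect is 0."""
    return [1 if d == label else 0 for d in dialects]


def implicit_negatives(label: str, dialects: List[str] = DIALECTS) -> List[str]:
    """Dialects that the naive conversion treats as negative without any evidence."""
    return [d for d in dialects if d != label]


# A sentence annotated only as "Egypt" in a single-label dataset.
target = to_multi_hot("Egypt")
unverified = implicit_negatives("Egypt")
print(target)      # [1, 0, 0]
print(unverified)  # ['Syria', 'Morocco']
```

The zeros in `target` encode an absence of annotation, not a judgment that the sentence is unacceptable in those dialects, which is why the choice of negatives matters when repurposing single-label data.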