[2602.23940] Benchmarking BERT-based Models for Sentence-level Topic Classification in Nepali Language
Computer Science > Computation and Language
arXiv:2602.23940 (cs) [Submitted on 27 Feb 2026]
Title: Benchmarking BERT-based Models for Sentence-level Topic Classification in Nepali Language
Authors: Nischal Karki, Bipesh Subedi, Prakash Poudyal, Rupak Raj Ghimire, Bal Krishna Bal

Abstract: Transformer-based models such as BERT have significantly advanced Natural Language Processing (NLP) across many languages. However, Nepali, a low-resource language written in the Devanagari script, remains relatively underexplored. This study benchmarks multilingual, Indic, Hindi, and Nepali BERT variants to evaluate their effectiveness in Nepali topic classification. Ten pre-trained models, including mBERT, XLM-R, MuRIL, DevBERT, HindiBERT, IndicBERT, and NepBERTa, were fine-tuned and tested on a balanced Nepali dataset of 25,006 sentences spanning five conceptual domains; performance was evaluated using accuracy, weighted precision, recall, F1-score, and AUROC. The results show that Indic models, particularly MuRIL-large, achieved the highest F1-score of 90.60%, outperforming both multilingual and monolingual models. NepBERTa also performed competitively with an F1-score of 88.26%. Overall, these findings establish a robust baseline for future document-level classification and broader Nepali ...
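The benchmark setup the abstract describes (fine-tune a pre-trained BERT variant on labeled Nepali sentences, then score it with accuracy, weighted precision/recall/F1, and AUROC) maps onto a standard HuggingFace workflow. Below is a minimal sketch of that pipeline using the public google/muril-large-cased checkpoint; the CSV file name, its "sentence"/"label" columns, and all hyperparameters are illustrative assumptions, not the paper's reported configuration.

```python
# Minimal fine-tuning/evaluation sketch for Nepali sentence-level topic
# classification, assuming HuggingFace Transformers + Datasets + scikit-learn.
# "nepali_topics.csv" and its column names are hypothetical stand-ins for the
# paper's 25,006-sentence dataset; hyperparameters are not from the paper.
import numpy as np
from datasets import load_dataset
from scipy.special import softmax
from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             roc_auc_score)
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "google/muril-large-cased"  # public MuRIL-large checkpoint
NUM_LABELS = 5                      # five conceptual domains

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL, num_labels=NUM_LABELS)

# Hypothetical CSV with columns "sentence" (Devanagari text) and "label" (0-4).
ds = load_dataset("csv", data_files="nepali_topics.csv")["train"]
ds = ds.train_test_split(test_size=0.2, seed=42)

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

ds = ds.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    """Accuracy, weighted P/R/F1, and weighted one-vs-rest AUROC."""
    logits, labels = eval_pred
    probs = softmax(logits, axis=-1)          # AUROC needs class probabilities
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted", zero_division=0)
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "auroc": roc_auc_score(labels, probs,
                               multi_class="ovr", average="weighted"),
    }

args = TrainingArguments(
    output_dir="muril-nepali-topic",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    evaluation_strategy="epoch",
)

trainer = Trainer(model=model, args=args,
                  train_dataset=ds["train"], eval_dataset=ds["test"],
                  tokenizer=tokenizer,       # enables dynamic padding
                  compute_metrics=compute_metrics)
trainer.train()
print(trainer.evaluate())
```

Swapping MODEL for any of the other checkpoints named in the abstract (e.g. bert-base-multilingual-cased for mBERT or xlm-roberta-base for XLM-R) reuses the same loop, which is what makes this kind of multi-model benchmark straightforward to run.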