[2511.07989] State of the Art in Text Classification for South Slavic Languages: Fine-Tuning or Prompting?


Summary

This article evaluates the performance of language models on text classification tasks for South Slavic languages, comparing fine-tuned BERT-like models with open-source and closed-source LLMs across three tasks in three domains.

Why It Matters

Understanding the effectiveness of different language models on less-resourced languages is crucial for advancing NLP applications in these regions. This research highlights the trade-offs between fine-tuning and prompting methods, providing insights that can guide future model development and deployment.

Key Takeaways

  • LLMs show strong zero-shot performance in text classification for South Slavic languages.
  • Fine-tuned BERT-like models remain practical for large-scale text annotation despite LLM advantages.
  • LLMs have drawbacks, including unpredictable outputs and higher computational costs.
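One drawback the takeaways mention is that LLM outputs are unpredictable: prompted to pick a label, a model may answer "Positive.", "The sentiment is clearly negative", or refuse outright. A common mitigation is to normalize free-form answers onto the closed label set before scoring. The label set and helper below are illustrative assumptions, not taken from the paper:

```python
# Hypothetical label inventory for a sentiment classification task.
LABELS = ["positive", "negative", "neutral", "mixed"]

def normalize_label(raw_output: str, labels=LABELS, fallback="unknown"):
    """Map a free-form model response onto a fixed label inventory."""
    text = raw_output.strip().lower()
    # Cheapest check first: exact match after trimming punctuation/quotes.
    cleaned = text.strip(".!\"' ")
    if cleaned in labels:
        return cleaned
    # Otherwise, return the first known label mentioned anywhere in the reply.
    for label in labels:
        if label in text:
            return label
    # No known label found: flag for manual review instead of guessing.
    return fallback

print(normalize_label("Negative."))                       # negative
print(normalize_label("The sentiment is clearly MIXED"))  # mixed
print(normalize_label("I cannot classify this text."))    # unknown
```

Routing unmatched answers to a `fallback` bucket, rather than silently guessing, makes the failure mode visible in large-scale annotation runs.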

Computer Science > Computation and Language
arXiv:2511.07989 (cs)
[Submitted on 11 Nov 2025 (v1), last revised 19 Feb 2026 (this version, v2)]

Title: State of the Art in Text Classification for South Slavic Languages: Fine-Tuning or Prompting?
Authors: Taja Kuzman Pungeršek, Peter Rupnik, Ivan Porupski, Vuk Dinić, Nikola Ljubešić

Abstract: Until recently, fine-tuned BERT-like models provided state-of-the-art performance on text classification tasks. With the rise of instruction-tuned decoder-only models, commonly known as large language models (LLMs), the field has increasingly moved toward zero-shot and few-shot prompting. However, the performance of LLMs on text classification, particularly on less-resourced languages, remains under-explored. In this paper, we evaluate the performance of current language models on text classification tasks across several South Slavic languages. We compare openly available fine-tuned BERT-like models with a selection of open-source and closed-source LLMs across three tasks in three domains: sentiment classification in parliamentary speeches, topic classification in news articles and parliamentary speeches, and genre identification in web texts. Our results show that LLMs demonstrate strong zero-shot performance, often matching or surpassing ...
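The abstract contrasts fine-tuning with zero-shot prompting, where the task is expressed entirely in the prompt. A minimal sketch of what a zero-shot prompt for one of the listed tasks (sentiment classification of parliamentary speeches) might look like is below; the wording and label set are hypothetical, not the paper's actual prompts:

```python
def build_zero_shot_prompt(text: str,
                           labels=("positive", "negative", "neutral")) -> str:
    """Build a zero-shot classification prompt with a constrained label set."""
    label_list = ", ".join(labels)
    return (
        "You are classifying the sentiment of a parliamentary speech.\n"
        f"Answer with exactly one of: {label_list}.\n\n"
        f"Speech:\n{text}\n\n"
        "Label:"
    )

prompt = build_zero_shot_prompt("I fully support this amendment.")
print(prompt)
```

No task-specific training data is needed; the trade-off, as the paper notes, is less predictable outputs and higher inference cost than a small fine-tuned encoder.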

Related Articles

Hackers Are Posting the Claude Code Leak With Bonus Malware | WIRED
Plus: The FBI says a recent hack of its wiretap tools poses a national security risk, attackers stole Cisco source code as part of an ong...
Wired - AI · 9 min · LLMs

People anxious about deviating from what AI tells them to do?
My friend came over yesterday to dye her hair. She had asked ChatGPT for the 'correct' way to do it. Chat told her to dye the ends first,...
Reddit - Artificial Intelligence · 1 min · LLMs

ChatGPT on trial: A landmark test of AI liability in the practice of law
AI Tools & Products · LLMs

What if Claude purposefully made its own code leakable so that it would get leaked
What if Claude leaked itself by socially and architecturally engineering itself to be leaked by a dumb human submitted by /u/smurfcsgoawp...
Reddit - Artificial Intelligence · 1 min · LLMs
