[2603.19253] A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

Computer Science > Computation and Language

arXiv:2603.19253 (cs)

[Submitted on 25 Feb 2026]

Title: A comprehensive study of LLM-based argument classification: from Llama through DeepSeek to GPT-5.2

Authors: Marcin Pietroń, Filip Gampel, Jakub Gomułka, Andrzej Tomski, Rafał Olszowski

Abstract: Argument mining (AM) is an interdisciplinary research field focused on the automatic identification and classification of argumentative components, such as claims and premises, and the relationships between them. Recent advances in large language models (LLMs) have significantly improved the performance of argument classification compared to traditional machine learning approaches. This study presents a comprehensive evaluation of several state-of-the-art LLMs, including GPT-5.2, Llama 4, and DeepSeek, on large publicly available argument classification corpora such as this http URL and UKP. The evaluation incorporates advanced prompting strategies, including Chain-of-Thought prompting, prompt rephrasing, voting, and certainty-based classification. Both quantitative performance metrics and qualitative error analysis are conducted to assess model behavior. The best-performing model in the study (GPT-5.2) achieves a classification accuracy of 78.0% (UKP) and 91.9% (this http URL). The use of prompt rephrasing, ...
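The abstract names prompt rephrasing, voting, and certainty-based classification without giving details in this excerpt. As a rough illustration only, and not the authors' implementation, the Python sketch below shows one way these ideas could be combined: the same argument is classified under several rephrased prompts, and the per-prompt answers are aggregated by a plain majority vote or by a certainty-weighted vote. The function names, the `(label, certainty)` tuple format, the threshold value, and the `claim`/`premise` labels are all assumptions made for this example.

```python
from collections import Counter, defaultdict
from typing import Callable, Iterable, Optional

# Hypothetical interface: a classifier takes (prompt, argument) and
# returns (label, certainty), with certainty assumed to lie in [0, 1].
Classifier = Callable[[str, str], tuple[str, float]]


def majority_vote(labels: Iterable[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def certainty_weighted_vote(
    predictions: Iterable[tuple[str, float]],
    min_certainty: float = 0.0,
) -> Optional[str]:
    """Sum certainties per label, ignoring predictions below the threshold.

    Returns None when every prediction falls below `min_certainty`.
    """
    scores: dict[str, float] = defaultdict(float)
    for label, certainty in predictions:
        if certainty >= min_certainty:
            scores[label] += certainty
    if not scores:
        return None
    return max(scores, key=scores.get)


def classify_with_rephrasings(
    classify: Classifier,
    prompts: list[str],
    argument: str,
    min_certainty: float = 0.5,  # assumed threshold, not taken from the paper
) -> dict[str, Optional[str]]:
    """Query the classifier once per rephrased prompt and aggregate."""
    predictions = [classify(prompt, argument) for prompt in prompts]
    return {
        "majority": majority_vote(label for label, _ in predictions),
        "weighted": certainty_weighted_vote(predictions, min_certainty),
    }


def stub_classifier(prompt: str, argument: str) -> tuple[str, float]:
    """Stand-in for a real LLM call, returning canned answers per prompt."""
    canned = {
        "prompt A": ("claim", 0.55),
        "prompt B": ("premise", 0.95),
        "prompt C": ("claim", 0.30),
    }
    return canned[prompt]
```

Calling `classify_with_rephrasings(stub_classifier, ["prompt A", "prompt B", "prompt C"], "some text")` shows how the two schemes can disagree: the majority vote follows the two low-certainty answers, while the weighted vote with a 0.5 threshold follows the single high-certainty one. In a real setup the stub would be replaced by a model call, and how certainty is obtained from the model is left open here.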

Originally published on March 23, 2026. Curated by AI News.
