[2603.04707] Detection of Illicit Content on Online Marketplaces using Large Language Models
Computer Science > Computation and Language
arXiv:2603.04707 (cs)
[Submitted on 5 Mar 2026]
Title: Detection of Illicit Content on Online Marketplaces using Large Language Models
Authors: Quoc Khoa Tran, Thanh Thi Nguyen, Campbell Wilson
Abstract: Online marketplaces, while revolutionizing global commerce, have inadvertently facilitated the proliferation of illicit activities, including drug trafficking, counterfeit sales, and cybercrimes. Traditional content moderation methods such as manual reviews and rule-based automated systems struggle with scalability, dynamic obfuscation techniques, and multilingual content. Conventional machine learning models, though effective in simpler contexts, often falter when confronting the semantic complexities and linguistic nuances characteristic of illicit marketplace communications. This research investigates the efficacy of Large Language Models (LLMs), specifically Meta's Llama 3.2 and Google's Gemma 3, in detecting and classifying illicit online marketplace content using the multilingual DUTA10K dataset. Employing fine-tuning techniques such as Parameter-Efficient Fine-Tuning (PEFT) and quantization, these models were systematically benchmarked against a foundational transformer-based model (BERT) and traditional machine learning baselines (Support Vector Mac...