[2603.21389] Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models
Computer Science > Computation and Language

arXiv:2603.21389 (cs) [Submitted on 22 Mar 2026]

Title: Task-Specific Efficiency Analysis: When Small Language Models Outperform Large Language Models
Authors: Jinghan Cao, Yu Ma, Xinjin Li, Qingyang Ren, Xiangyun Chen

Abstract: Large Language Models achieve remarkable performance but incur substantial computational costs that make them unsuitable for resource-constrained deployments. This paper presents the first comprehensive task-specific efficiency analysis comparing 16 language models across five diverse NLP tasks. We introduce the Performance-Efficiency Ratio (PER), a novel metric that integrates accuracy, throughput, memory, and latency through geometric-mean normalization. Our systematic evaluation reveals that small models (0.5--3B parameters) achieve superior PER scores across all five tasks. These findings establish quantitative foundations for deploying small models in production environments that prioritize inference efficiency over marginal accuracy gains.

Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2603.21389 [cs.CL] (or arXiv:2603.21389v1 [cs.CL] for this version)
DOI: https://doi.org/10.48550/arXiv.2603.21389 (arXiv-issued DOI via DataCite, pending registration)
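The abstract describes the PER only at a high level: accuracy, throughput, memory, and latency combined through geometric-mean normalization, with the exact formula left to the paper. The sketch below is one plausible reading, not the authors' definition: each metric is scaled relative to the best model in the comparison (inverting memory and latency, where lower is better), and the four normalized components are combined with a geometric mean. The function name, parameters, and all numbers are hypothetical.

```python
import numpy as np

def per_scores(accuracy, throughput, memory, latency):
    """Hypothetical Performance-Efficiency Ratio (PER) sketch.

    One entry per evaluated model in each argument. The exact PER
    formula is not given in the abstract; this assumes each metric is
    normalized to (0, 1] against the best model in the comparison,
    then the four components are combined with a geometric mean.
    """
    def norm(x, higher_is_better=True):
        x = np.asarray(x, dtype=float)
        # Best model gets 1.0; others scale proportionally.
        return x / x.max() if higher_is_better else x.min() / x

    components = np.stack([
        norm(accuracy),                         # higher is better
        norm(throughput),                       # higher is better
        norm(memory, higher_is_better=False),   # lower is better
        norm(latency, higher_is_better=False),  # lower is better
    ])
    # Geometric mean over the four normalized components, per model.
    return np.prod(components, axis=0) ** (1.0 / len(components))

# Toy comparison: [small ~1.5B model, large ~70B model] (made-up numbers).
per = per_scores(
    accuracy=[0.78, 0.85],     # task accuracy
    throughput=[120.0, 15.0],  # tokens/second
    memory=[2.1, 48.0],        # GB of GPU memory
    latency=[40.0, 350.0],     # ms per request
)
print(per)  # the small model scores higher despite lower accuracy
```

In this toy comparison, the small model's large advantage on throughput, memory, and latency outweighs its modest accuracy deficit under the geometric mean, mirroring the paper's finding that 0.5--3B models can achieve superior PER scores.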