[2603.26516] ALBA: A European Portuguese Benchmark for Evaluating Language and Linguistic Dimensions in Generative LLMs
Computer Science > Computation and Language
arXiv:2603.26516 (cs)
[Submitted on 27 Mar 2026]

Title: ALBA: A European Portuguese Benchmark for Evaluating Language and Linguistic Dimensions in Generative LLMs

Authors: Inês Vieira, Inês Calvo, Iago Paulo, James Furtado, Rafael Ferreira, Diogo Tavares, Diogo Glória-Silva, David Semedo, João Magalhães

Abstract: As Large Language Models (LLMs) expand across multilingual domains, evaluating their performance in under-represented languages becomes increasingly important. European Portuguese (pt-PT) is particularly affected, as existing training data and benchmarks are mainly in Brazilian Portuguese (pt-BR). To address this, we introduce ALBA, a linguistically grounded benchmark designed from the ground up to assess LLM proficiency on linguistics-related tasks in pt-PT across eight linguistic dimensions: Language Variety, Culture-bound Semantics, Discourse Analysis, Word Plays, Syntax, Morphology, Lexicology, and Phonetics and Phonology. ALBA was manually constructed by language experts and is paired with an LLM-as-a-judge framework for scalable evaluation of generated pt-PT text. Experiments on a diverse set of models reveal performance variability across linguistic dimensions, highlighting the need for comprehensive, varie...
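The abstract describes pairing expert-authored benchmark items with an LLM-as-a-judge framework and aggregating results per linguistic dimension. A minimal sketch of that evaluation loop is below; the `BenchmarkItem` structure, the toy token-overlap `judge`, and the 0-5 scale are illustrative assumptions, not ALBA's actual schema or judging prompt (which the paper implements with an LLM scoring generated pt-PT text).

```python
from dataclasses import dataclass

@dataclass
class BenchmarkItem:
    # Hypothetical item schema: one prompt per linguistic dimension,
    # paired with an expert reference answer.
    dimension: str
    prompt: str
    reference: str

def judge(response: str, item: BenchmarkItem) -> int:
    # Stand-in for the LLM judge: a toy token-overlap score on a 0-5
    # scale. ALBA's real judge would be an LLM applying expert criteria.
    ref_tokens = set(item.reference.lower().split())
    resp_tokens = set(response.lower().split())
    if not ref_tokens:
        return 0
    overlap = len(ref_tokens & resp_tokens) / len(ref_tokens)
    return round(overlap * 5)

def evaluate(items, generate):
    # Score every item with the judge, then average per dimension,
    # mirroring the per-dimension reporting described in the abstract.
    scores: dict[str, list[int]] = {}
    for item in items:
        scores.setdefault(item.dimension, []).append(
            judge(generate(item.prompt), item)
        )
    return {dim: sum(vals) / len(vals) for dim, vals in scores.items()}
```

Usage: pass `evaluate` a list of items and a `generate` callable wrapping the model under test; the result maps each dimension name to its mean judge score, making cross-dimension variability directly visible.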