[2603.27223] EuraGovExam: A Multilingual Multimodal Benchmark from Real-World Civil Service Exams
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.27223 (cs)
[Submitted on 28 Mar 2026]

Title: EuraGovExam: A Multilingual Multimodal Benchmark from Real-World Civil Service Exams
Authors: JaeSeong Kim, Chaehwan Lim, Sang Hyun Gil, Suan Lee

Abstract: We present EuraGovExam, a multilingual and multimodal benchmark sourced from real-world civil service examinations across five representative Eurasian regions: South Korea, Japan, Taiwan, India, and the European Union. Designed to reflect the authentic complexity of public-sector assessments, the dataset contains over 8,000 high-resolution scanned multiple-choice questions covering 17 diverse academic and administrative domains. Unlike existing benchmarks, EuraGovExam embeds all question content--including problem statements, answer choices, and visual elements--within a single image, providing only a minimal standardized instruction for answer formatting. This design demands that models perform layout-aware, cross-lingual reasoning directly from visual input. All items are drawn from real exam documents, preserving rich visual structures such as tables, multilingual typography, and form-like layouts. Evaluation results show that even state-of-the-art vision-language models (VLMs) achieve only 86% accuracy, underscoring the benchma...