[2505.06046] Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information
Computer Science > Computation and Language
arXiv:2505.06046 (cs)
[Submitted on 9 May 2025 (v1), last revised 4 Mar 2026 (this version, v3)]

Title: Healthy LLMs? Benchmarking LLM Knowledge of UK Government Public Health Information

Authors: Joshua Harris, Fan Grayson, Felix Feldman, Timothy Laurence, Toby Nonnenmacher, Oliver Higgins, Leo Loman, Selina Patel, Thomas Finnie, Samuel Collins, Michael Borowitz

Abstract: As Large Language Models (LLMs) become widely accessible, a detailed understanding of their knowledge within specific domains becomes necessary for successful real world use. This is particularly critical in the domains of medicine and public health, where failure to retrieve relevant, accurate, and current information could significantly impact UK residents. However, while there are a number of LLM benchmarks in the medical domain, currently little is known about LLM knowledge within the field of public health. To address this issue, this paper introduces a new benchmark, PubHealthBench, with over 8000 questions for evaluating LLMs' Multiple Choice Question Answering (MCQA) and free form responses to public health queries. To create PubHealthBench we extract free text from 687 current UK government guidance documents and implement an automated pipeline for generating MCQA samples. Ass...