[2603.01945] When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation
Computer Science > Computation and Language
arXiv:2603.01945 (cs)
[Submitted on 2 Mar 2026]

Title: When Numbers Tell Half the Story: Human-Metric Alignment in Topic Model Evaluation
Authors: Thibault Prouteau, Francis Lareau, Nicolas Dugué, Jean-Charles Lamirel, Christophe Malaterre

Abstract: Topic models uncover latent thematic structures in text corpora, yet evaluating their quality remains challenging, particularly in specialized domains. Existing approaches often rely on automated metrics such as topic coherence and diversity, which may not fully align with human judgment. Human evaluation tasks, such as word intrusion, provide valuable insights but are costly and have primarily been validated on general-domain corpora. This paper introduces Topic Word Mixing (TWM), a novel human evaluation task that assesses inter-topic distinctness by testing whether annotators can distinguish word sets drawn from a single topic from those drawn from mixed topics. TWM complements word intrusion's focus on intra-topic coherence and provides a human-grounded counterpart to diversity metrics. We evaluate six topic models, both statistical and embedding-based (LDA, NMF, Top2Vec, BERTopic, CFMF, CFMF-emb), comparing automated metrics with human evaluation methods based on nearly 4,000 annotations from a domain-specific corpus of philosophy of science p...
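For readers unfamiliar with the automated diversity metrics that TWM is meant to complement, the sketch below shows one common formulation: the fraction of unique words among the top-N words across all topics. This is an illustrative implementation of a standard metric, not the paper's exact evaluation code; the toy topics are hypothetical.

```python
def topic_diversity(topics, top_n=25):
    """Fraction of unique words among the top-N words of all topics.

    `topics` is a list of ranked word lists, one per topic. A value of
    1.0 means no word is shared across topics; lower values indicate
    overlapping, less distinct topics -- the property TWM probes with
    human annotators instead of word counts.
    """
    top_words = [w for topic in topics for w in topic[:top_n]]
    return len(set(top_words)) / len(top_words)

# Hypothetical toy topics for illustration: "theory" appears in both,
# so 7 of the 8 top words are unique.
topics = [
    ["theory", "model", "evidence", "explanation"],
    ["theory", "experiment", "data", "measurement"],
]
print(topic_diversity(topics, top_n=4))  # 7 / 8 = 0.875
```

A metric like this treats any word overlap as reduced distinctness, whereas TWM asks whether humans can actually tell the topics apart, which is why the paper positions it as a human-grounded counterpart.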