[2501.18731] Evaluating Spoken Language as a Biomarker for Automated Screening of Cognitive Impairment
Computer Science > Machine Learning
arXiv:2501.18731 (cs)
[Submitted on 30 Jan 2025 (v1), last revised 2 Mar 2026 (this version, v2)]

Title: Evaluating Spoken Language as a Biomarker for Automated Screening of Cognitive Impairment

Authors: Maria R. Lima, Alexander Capstick, Fatemeh Geranmayeh, Ramin Nilforooshan, Maja Matarić, Ravi Vaidyanathan, Payam Barnaghi

Abstract: Timely and accurate assessment of cognitive impairment remains a major unmet need. Speech biomarkers offer a scalable, non-invasive, cost-effective solution for automated screening. However, the clinical utility of machine learning (ML) remains limited by interpretability and generalisability to real-world speech datasets. We evaluate explainable ML for screening of Alzheimer's disease and related dementias (ADRD) and severity prediction using benchmark DementiaBank speech (N = 291, 64% female, 69.8 (SD = 8.6) years). We validate generalisability on pilot data collected in-residence (N = 22, 59% female, 76.2 (SD = 8.0) years). To enhance clinical utility, we stratify risk for actionable triage and assess linguistic feature importance. We show that a Random Forest trained on linguistic features for ADRD detection achieves a mean sensitivity of 69.4% (95% confidence interval (CI) = 66.4-72.5) and specificity of 83.3% (78.0-88.7). On pilot data, th...