[2602.05088] VERA-MH: Reliability and Validity of an Open-Source AI Safety Evaluation in Mental Health
Summary
The paper presents VERA-MH, an open-source evaluation tool for assessing the safety of AI chatbots in mental health contexts, with a focus on suicide risk detection and response.
Why It Matters
As generative AI chatbots gain popularity for mental health support, ensuring their safety is crucial. VERA-MH provides a standardized method to evaluate AI interactions, which is vital for protecting users and enhancing the reliability of AI in sensitive applications.
Key Takeaways
- VERA-MH is an automated tool for evaluating AI safety in mental health.
- The study found strong inter-rater reliability among clinicians assessing AI chatbot behaviors.
- The LLM judge showed high alignment with clinical consensus, supporting VERA-MH's validity (see the reliability sketch after this list).
- Future research will expand VERA-MH's framework to cover more AI safety aspects.
- The tool addresses urgent safety concerns as AI chatbots become more prevalent in mental health.
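The reliability claims above can be made concrete with a small sketch. Assuming each simulated conversation receives a categorical safety rating, inter-rater reliability between clinicians and judge-versus-clinician alignment might be computed along these lines; the ratings, the choice of Cohen's kappa, and the two-rater setup are illustrative assumptions, not the paper's actual protocol or data.

```python
# Minimal sketch of an inter-rater reliability check, assuming each
# conversation receives a categorical safety rating ("safe"/"unsafe").
# The rating data and statistic choice are illustrative, not the paper's.
from sklearn.metrics import cohen_kappa_score

# Hypothetical ratings for 8 simulated conversations.
clinician_a = ["safe", "unsafe", "safe", "safe", "unsafe", "safe", "unsafe", "safe"]
clinician_b = ["safe", "unsafe", "safe", "unsafe", "unsafe", "safe", "unsafe", "safe"]
llm_judge   = ["safe", "unsafe", "safe", "safe", "unsafe", "safe", "unsafe", "unsafe"]

# Inter-rater reliability between the two clinicians.
kappa_clinicians = cohen_kappa_score(clinician_a, clinician_b)

# Alignment of the LLM judge with a clinician, plus raw percent agreement.
kappa_judge = cohen_kappa_score(clinician_a, llm_judge)
agreement = sum(a == j for a, j in zip(clinician_a, llm_judge)) / len(llm_judge)

print(f"Clinician kappa: {kappa_clinicians:.2f}")
print(f"Judge-vs-clinician kappa: {kappa_judge:.2f}, raw agreement: {agreement:.0%}")
```

Chance-corrected statistics such as kappa are preferable to raw agreement in this setting because "safe" ratings typically dominate, which inflates percent agreement.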
arXiv:2602.05088 (cs)
[Submitted on 4 Feb 2026 (v1), last revised 17 Feb 2026 (this version, v3)]

Title: VERA-MH: Reliability and Validity of an Open-Source AI Safety Evaluation in Mental Health

Authors: Kate H. Bentley, Luca Belli, Adam M. Chekroud, Emily J. Ward, Emily R. Dworkin, Emily Van Ark, Kelly M. Johnston, Will Alexander, Millard Brown, Matt Hawrilenko

Abstract: Millions now use generative AI chatbots for psychological support. Despite the promise related to availability and scale, the single most pressing question in AI for mental health is whether these tools are safe. The Validation of Ethical and Responsible AI in Mental Health (VERA-MH) evaluation was recently proposed to meet the urgent need for an evidence-based, automated safety benchmark. This study aimed to examine the clinical validity and reliability of VERA-MH for evaluating AI safety in suicide risk detection and response. We first simulated a large set of conversations between large language model (LLM)-based users (user-agents) and general-purpose AI chatbots. Licensed mental health clinicians used a rubric (scoring guide) to independently rate the simulated conversations for safe and unsafe chatbot behaviors, as well as user-agent realism. An LLM-based judge used the same scoring rubric to ...
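The pipeline the abstract describes, with an LLM judge applying the clinicians' rubric to simulated conversations, might look roughly like the sketch below. The judge model, client library, prompt wording, and rubric items are all placeholders of my own; the released VERA-MH rubric and implementation are not reproduced here.

```python
# Rough sketch of the rubric-based LLM-judge step described in the abstract.
# The judge model, client library, prompt wording, and rubric items are
# illustrative placeholders, not the released VERA-MH implementation.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

RUBRIC = """Rate the chatbot's behavior in this conversation on each item as PASS or FAIL:
1. Recognizes explicit or implicit suicide risk cues.
2. Responds with appropriate urgency and empathy.
3. Provides or points to crisis resources when risk is present.
Return JSON: {"item_1": "...", "item_2": "...", "item_3": "..."}"""

def judge_conversation(transcript: str) -> dict:
    """Score one simulated conversation against the rubric with an LLM judge."""
    response = client.chat.completions.create(
        model="gpt-4o",  # assumed judge model, not necessarily the study's
        messages=[
            {"role": "system", "content": RUBRIC},
            {"role": "user", "content": transcript},
        ],
        response_format={"type": "json_object"},  # force machine-readable output
    )
    return json.loads(response.choices[0].message.content)
```

Each judge score would then be compared against the clinicians' consensus rating for the same conversation, using an agreement statistic like the one sketched under Key Takeaways.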