[2512.20352] Multi-LLM Thematic Analysis with Dual Reliability Metrics: Combining Cohen's Kappa and Semantic Similarity for Qualitative Research Validation

Summary

This paper presents a framework for validating LLM-based thematic analysis in qualitative research. It runs several LLMs repeatedly over the same data and measures reliability with two metrics: Cohen's Kappa for inter-rater agreement and cosine similarity for semantic consistency.

Why It Matters

Traditional inter-rater agreement in qualitative research requires multiple human coders, is time-intensive, and often yields only moderate consistency. By pairing an established agreement statistic with a semantic similarity measure across repeated LLM runs, the framework gives researchers a more scalable way to check the reliability of AI-assisted thematic analysis.

Key Takeaways

  • Introduces a dual reliability metrics framework for qualitative research validation.
  • Evaluates three LLMs (Gemini 2.5 Pro, GPT-4o, Claude 3.5 Sonnet) on a psychedelic art therapy interview transcript, with six independent runs per model.
  • Provides an open-source implementation for researchers to utilize.
  • Enhances traditional qualitative methods with AI capabilities.
  • Establishes a methodological foundation for AI-assisted qualitative research.

Computer Science > Computation and Language
arXiv:2512.20352 (cs)
[Submitted on 23 Dec 2025 (v1), last revised 14 Feb 2026 (this version, v2)]

Authors: Nilesh Jain, Hyungil Suh, Seyi Adeyinka, Leor Roseman, Aza Allsop

Abstract: Qualitative research faces a critical reliability challenge: traditional inter-rater agreement methods require multiple human coders, are time-intensive, and often yield moderate consistency. We present a multi-perspective validation framework for LLM-based thematic analysis that combines ensemble validation with dual reliability metrics: Cohen's Kappa ($\kappa$) for inter-rater agreement and cosine similarity for semantic consistency. Our framework enables configurable analysis parameters (1-6 seeds, temperature 0.0-2.0), supports custom prompt structures with variable substitution, and provides consensus theme extraction across any JSON format. As proof-of-concept, we evaluate three leading LLMs (Gemini 2.5 Pro, GPT-4o, Claude 3.5 Sonnet) on a psychedelic art therapy interview transcript, conducting six independent runs per model. Results demonstrate Gemini achieves hi...
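The two metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that each run is reduced to a list of categorical labels (here, whether a candidate theme was identified, 1 or 0) and that theme descriptions have already been turned into numeric vectors by some embedding step. The function names, the sample labels, and the toy vectors are all hypothetical.

```python
import math
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters (here: two runs) labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items on which both runs agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each run's marginal label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both runs used a single identical label; treat as perfect agreement.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_u * norm_v)


# Hypothetical data: did each of six candidate themes appear in a run?
run_a = [1, 1, 0, 1, 0, 0]
run_b = [1, 0, 0, 1, 0, 1]
kappa = cohens_kappa(run_a, run_b)

# Hypothetical embedding vectors for two theme descriptions.
theme_vec_a = [1.0, 2.0, 3.0]
theme_vec_b = [2.0, 4.0, 6.0]
similarity = cosine_similarity(theme_vec_a, theme_vec_b)
```

In this toy data the two runs agree on four of six themes, while chance agreement is 0.5, so kappa comes out well below the raw agreement rate. That gap is why the two measures are complementary: kappa corrects categorical agreement for chance, and cosine similarity captures whether differently worded themes still point in the same semantic direction.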
