[2602.21165] PVminer: A Domain-Specific Tool to Detect the Patient Voice in Patient Generated Data

Summary

PVminer is a novel NLP framework designed to detect the patient voice in patient-generated data, improving the analysis of patient-provider communication.

Why It Matters

This research addresses the challenge of analyzing large volumes of patient-generated text, which is central to understanding patient perspectives and improving healthcare delivery. By combining domain-adapted BERT encoders with unsupervised topic modeling, PVminer aims to extract structured insights from patient communications at a scale that manual qualitative coding cannot reach.

Key Takeaways

  • PVminer utilizes a domain-adapted NLP framework to analyze patient-generated text.
  • The tool achieves high performance in detecting patient voice with F1 scores exceeding 80%.
  • It integrates patient-specific BERT encoders and unsupervised topic modeling for enhanced semantic understanding.
  • PVminer addresses the limitations of traditional qualitative coding frameworks in healthcare.
  • The source code and annotated datasets will be publicly available for further research.
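
To make the F1 figure above concrete, here is a minimal sketch of a micro-averaged F1 computation for multi-label predictions. It is an illustration only: the summary does not say which F1 averaging the authors report, and the label names below are hypothetical, not codes from the paper.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1 over multi-label predictions.

    Each message has a set of gold labels and a set of predicted labels.
    Counts are pooled over all messages and labels before computing F1.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        tp += len(g & p)   # labels predicted and correct
        fp += len(p - g)   # labels predicted but not in gold
        fn += len(g - p)   # gold labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical label names, for illustration only.
gold = [{"label_a", "label_b"}, {"label_c"}, set()]
pred = [{"label_a"}, {"label_c", "label_b"}, set()]
print(round(micro_f1(gold, pred), 4))  # 0.6667
```

Micro-averaging weights every label decision equally, so frequent labels dominate the score; macro-averaging (mean of per-label F1) would treat rare labels on equal footing.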

Computer Science > Computation and Language
arXiv:2602.21165 (cs)
[Submitted on 24 Feb 2026]

Title: PVminer: A Domain-Specific Tool to Detect the Patient Voice in Patient Generated Data

Authors: Samah Fodeh, Linhai Ma, Yan Wang, Srivani Talakokkul, Ganesh Puthiaraju, Afshan Khan, Ashley Hagaman, Sarah Lowe, Aimee Roundtree

Abstract: Patient-generated text such as secure messages, surveys, and interviews contains rich expressions of the patient voice (PV), reflecting communicative behaviors and social determinants of health (SDoH). Traditional qualitative coding frameworks are labor intensive and do not scale to large volumes of patient-authored messages across health systems. Existing machine learning (ML) and natural language processing (NLP) approaches provide partial solutions but often treat patient-centered communication (PCC) and SDoH as separate tasks or rely on models not well suited to patient-facing language. We introduce PVminer, a domain-adapted NLP framework for structuring patient voice in secure patient-provider communication. PVminer formulates PV detection as a multi-label, multi-class prediction task integrating patient-specific BERT encoders (PV-BERT-base and PV-BERT-large), unsupervised topic modeling for thematic augmentation (PV-Topic-BERT), and fine-tuned classifiers for Cod...
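
The multi-label formulation in the abstract can be illustrated with a small sketch. It assumes, purely for illustration, that a message's encoder embedding is concatenated with its topic distribution and that each label gets an independent sigmoid score; the abstract does not specify how PVminer combines these signals. The function `predict_labels`, the weights, and the label names are all hypothetical.

```python
import math
from typing import Dict, List, Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(
    embedding: Sequence[float],
    topic_probs: Sequence[float],
    weights: Dict[str, Sequence[float]],
    biases: Dict[str, float],
    threshold: float = 0.5,
) -> List[str]:
    """Return every label whose independent sigmoid score meets the threshold.

    ASSUMPTION: features are the encoder embedding concatenated with the
    topic distribution. This is one plausible reading of "thematic
    augmentation", not the authors' confirmed design.
    """
    features = list(embedding) + list(topic_probs)
    predicted = []
    for label, w in weights.items():
        if len(w) != len(features):
            raise ValueError(f"weight size mismatch for {label}")
        logit = sum(wi * xi for wi, xi in zip(w, features)) + biases[label]
        # Labels are scored independently, so a message can receive
        # zero, one, or several labels (multi-label, not softmax).
        if sigmoid(logit) >= threshold:
            predicted.append(label)
    return sorted(predicted)


# Toy values, for illustration only.
embedding = [1.0, -1.0]      # stand-in for a BERT sentence embedding
topic_probs = [0.8, 0.2]     # stand-in for a topic-model distribution
weights = {
    "label_a": [2.0, 0.0, 1.0, 0.0],
    "label_b": [0.0, 2.0, 0.0, 0.0],
    "label_c": [0.0, 0.0, 0.0, 5.0],
}
biases = {"label_a": 0.0, "label_b": 0.0, "label_c": 0.0}
print(predict_labels(embedding, topic_probs, weights, biases))
# ['label_a', 'label_c']
```

Using per-label sigmoids rather than a single softmax is what allows one message to carry several codes at once, which matches the multi-label framing in the abstract.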
