[2507.03043] K-Function: Joint Pronunciation Transcription and Feedback for Evaluating Kids Language Function


Summary

K-Function is a framework for evaluating children's language that pairs accurate phoneme-level transcription with LLM-driven scoring, so that grading rests on high-quality transcripts of what the child actually said.

Why It Matters

Automatic speech recognizers struggle with young children's speech because of high-pitched voices, prolonged sounds, and limited training data. By improving phoneme recognition and building an assessment framework on top of it, this work opens a path to scalable language screening, which matters for detecting language issues early in childhood.

Key Takeaways

  • K-Function combines sub-word transcription with LLM scoring for children's language evaluation.
  • The Kids-Weighted Finite State Transducer (K-WFST) reaches a 1.39% phoneme error rate on MyST and 8.61% on Multitudes, absolute improvements of 10.47% and 7.06% over a greedy-search decoder.
  • High-quality transcripts enable accurate grading of verbal skills and developmental milestones.
  • The framework supports scalable language screening for children, enhancing early detection of language issues.
  • Results align closely with human evaluators, validating the effectiveness of the approach.

Computer Science > Computation and Language

arXiv:2507.03043 (cs)

[Submitted on 3 Jul 2025 (v1), last revised 24 Feb 2026 (this version, v3)]

Title: K-Function: Joint Pronunciation Transcription and Feedback for Evaluating Kids Language Function

Authors: Shuhe Li, Chenxu Guo, Jiachen Lian, Cheol Jun Cho, Wenshuo Zhao, Xiner Xu, Ruiyu Jin, Xiaoyu Shi, Xuanru Zhou, Dingkun Zhou, Sam Wang, Grace Wang, Jingze Yang, Jingyi Xu, Ruohan Bao, Xingrui Chen, Elise Brenner, Brandon In, Francesca Pei, Maria Luisa Gorno-Tempini, Gopala Anumanchipalli

Abstract: Evaluating young children's language is challenging for automatic speech recognizers due to high-pitched voices, prolonged sounds, and limited data. We introduce K-Function, a framework that combines accurate sub-word transcription with objective, Large Language Model (LLM)-driven scoring. Its core, Kids-Weighted Finite State Transducer (K-WFST), merges an acoustic phoneme encoder with a phoneme-similarity model to capture child-specific speech errors while remaining fully interpretable. K-WFST achieves a 1.39% phoneme error rate on MyST and 8.61% on Multitudes, an absolute improvement of 10.47% and 7.06% over a greedy-search decoder. These high-quality transcripts are used by an LLM to grade verbal skills, developmental milestones, reading, a...
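The phoneme error rate (PER) quoted in the abstract is conventionally computed as the edit distance between the reference and predicted phoneme sequences, divided by the reference length. The sketch below shows that standard definition; it is not the paper's evaluation code, and the function names and the toy phoneme sequences are assumptions made for illustration. It also derives the greedy-search baseline PER implied by the abstract, assuming "absolute improvement" means a difference in percentage points.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two phoneme sequences (unit costs)."""
    prev = list(range(len(hyp) + 1))  # distance from empty ref prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to empty hyp
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # ref phoneme missing from hyp
            insertion = curr[j - 1] + 1  # extra phoneme in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref, hyp):
    """PER = edit distance / number of reference phonemes."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Toy example (hypothetical data): "cat" with one substituted vowel.
reference = ["K", "AE", "T"]
hypothesis = ["K", "AH", "T"]
per = phoneme_error_rate(reference, hypothesis)  # one error in three phonemes

# Figures from the abstract: K-WFST PER and absolute improvement, in percent.
reported = {"MyST": (1.39, 10.47), "Multitudes": (8.61, 7.06)}

# Assumption: baseline PER = K-WFST PER + absolute improvement.
implied_baseline = {
    name: round(kwfst_per + improvement, 2)
    for name, (kwfst_per, improvement) in reported.items()
}

if __name__ == "__main__":
    print(f"toy PER: {per:.3f}")
    for name, baseline in implied_baseline.items():
        print(f"{name}: implied greedy-search PER = {baseline}%")
```

Under that reading, the greedy-search decoder would sit at roughly 11.86% PER on MyST and 15.67% on Multitudes. These baseline values are derived here from the abstract's numbers, not quoted from the paper.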
