[2602.12287] Retrieval-Augmented Self-Taught Reasoning Model with Adaptive Chain-of-Thought for ASR Named Entity Correction
Summary
This article presents a retrieval-augmented reasoning model designed to correct named entity errors in automatic speech recognition output, achieving substantial relative reductions in character error rate on two benchmark datasets.
Why It Matters
Accurate handling of named entities is critical to automatic speech recognition (ASR) systems, especially in specialized domains where misrecognized entities can derail downstream tasks. This research addresses limitations of current correction methods by leveraging the reasoning capabilities of large language models, potentially improving ASR applications across various industries.
Key Takeaways
- Introduces a retrieval-augmented generation framework for ASR named entity correction.
- Combines a rephrasing language model with a phonetic-level candidate retrieval method.
- Features a self-taught reasoning model that adapts its reasoning depth based on task complexity.
- Demonstrates relative character error rate reductions of 17.96% and 34.42% on two datasets.
- Highlights the importance of sophisticated reasoning in improving ASR accuracy.
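The adaptive chain-of-thought component described above adjusts reasoning depth to task difficulty. The paper does not publish its exact policy, so the following is a minimal, hypothetical sketch: a `choose_reasoning_depth` function (the name, thresholds, and mode labels are all illustrative assumptions, not from the paper) that maps a difficulty estimate to one of three reasoning modes.

```python
def choose_reasoning_depth(difficulty: float,
                           thresholds: tuple[float, float] = (0.2, 0.5)) -> str:
    """Map a task-difficulty estimate in [0, 1] to a reasoning mode.

    Thresholds are illustrative, not taken from the paper: easy cases
    get a direct answer, medium cases a brief chain of thought, and
    hard cases a full step-by-step derivation.
    """
    low, high = thresholds
    if difficulty < low:
        return "direct"     # answer without explicit reasoning
    if difficulty < high:
        return "short-cot"  # brief chain of thought
    return "full-cot"       # detailed step-by-step reasoning
```

In practice the difficulty signal might come from, e.g., the phonetic distance between the ASR hypothesis and its best retrieved candidate, but that pairing is an assumption here.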
Computer Science > Computation and Language — arXiv:2602.12287 (cs)
[Submitted on 21 Jan 2026]
Title: Retrieval-Augmented Self-Taught Reasoning Model with Adaptive Chain-of-Thought for ASR Named Entity Correction
Authors: Junjie An, Jingguang Tian, Tianyi Wang, Yu Gao, Xiaofeng Mou, Yi Xu
Abstract: End-to-end automatic speech recognition (ASR) systems frequently misrecognize domain-specific phrases like named entities, which can cause catastrophic failures in downstream tasks. A new family of named entity correction methods based on large language models (LLMs) has recently emerged. However, these approaches have yet to fully exploit the sophisticated reasoning capabilities inherent to LLMs. To bridge this gap, we propose a novel retrieval-augmented generation framework for correcting named entity errors in ASR. Our approach consists of two key components: (1) a rephrasing language model (RLM) for named entity recognition, followed by candidate retrieval using a phonetic-level edit distance; and (2) a novel self-taught reasoning model with adaptive chain-of-thought (A-STAR) that dynamically adjusts the depth of its reasoning based on task difficulty. Experiments on the AISHELL-1 and Homophone datasets demonstrate the effectiveness of our method, which achieves relative reductions in character error rate of 17.96% and 34.42% on the two datasets.
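The candidate-retrieval step in the abstract ranks entities by phonetic-level edit distance. As a minimal sketch, assuming phonemes are represented as token lists (e.g., pinyin syllables for Mandarin) and using standard Levenshtein distance — the paper's exact phonetic representation and scoring may differ, and `retrieve_candidates` is a hypothetical helper:

```python
def edit_distance(a: list[str], b: list[str]) -> int:
    """Levenshtein distance between two phoneme-token sequences,
    computed with a single rolling DP row."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                       # deletion
                        dp[j - 1] + 1,                   # insertion
                        prev + (a[i - 1] != b[j - 1]))   # substitution
            prev = cur
    return dp[n]

def retrieve_candidates(hypothesis_phones: list[str],
                        entity_list: list[tuple[str, list[str]]],
                        top_k: int = 3) -> list[tuple[str, list[str]]]:
    """Rank (entity, phoneme-sequence) pairs by phonetic edit distance
    to the ASR hypothesis and keep the top-k closest."""
    scored = sorted(entity_list,
                    key=lambda e: edit_distance(hypothesis_phones, e[1]))
    return scored[:top_k]
```

For example, a hypothesis transcribed as the pinyin tokens `["zhang", "san"]` would retrieve the entity whose phoneme sequence is closest to it from a domain entity list.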