[2602.20967] Training-Free Intelligibility-Guided Observation Addition for Noisy ASR
Summary
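This paper proposes a training-free method for improving automatic speech recognition (ASR) in noisy environments. It fuses noisy and enhanced speech (observation addition, OA) using fusion weights derived from intelligibility estimates obtained directly from the backend ASR.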
Why It Matters
ASR performance degrades severely in noise, and speech enhancement (SE) front-ends that suppress the noise often introduce artifacts that harm recognition. The proposed method improves recognition without training a weight predictor and without modifying the parameters of the SE or ASR models, which is relevant to voice recognition technology in real-world environments.
Key Takeaways
- Introduces a training-free method for ASR enhancement in noisy conditions.
- Utilizes intelligibility estimates from ASR to guide observation addition.
- Demonstrates strong robustness and improvements over existing OA baselines.
- Reduces complexity and enhances generalization relative to OA methods based on trained neural predictors.
- Provides extensive experimental validation across diverse SE-ASR combinations and datasets.
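The fusion idea in the takeaways above can be illustrated with a minimal Python sketch. It assumes utterance-level OA as a convex combination of the noisy and SE-enhanced waveforms. The source does not state how intelligibility estimates are turned into fusion weights, so the normalized-ratio mapping in `fusion_weight` is a hypothetical stand-in, all function names are illustrative, and the intelligibility scores are taken as given numbers rather than computed from an ASR model.

```python
from typing import List, Sequence


def fusion_weight(intel_noisy: float, intel_enhanced: float) -> float:
    """Hypothetical mapping from two intelligibility scores to a weight.

    Returns the weight given to the enhanced signal, in [0, 1].
    This normalized ratio is an assumption, not the paper's formula.
    """
    if intel_noisy < 0 or intel_enhanced < 0:
        raise ValueError("intelligibility scores must be non-negative")
    total = intel_noisy + intel_enhanced
    if total == 0:
        return 0.5  # no evidence either way: weight both signals equally
    return intel_enhanced / total


def observation_addition(
    noisy: Sequence[float], enhanced: Sequence[float], w_enhanced: float
) -> List[float]:
    """Fuse two time-aligned waveforms sample by sample (assumed OA form)."""
    if len(noisy) != len(enhanced):
        raise ValueError("noisy and enhanced signals must have equal length")
    if not 0.0 <= w_enhanced <= 1.0:
        raise ValueError("w_enhanced must lie in [0, 1]")
    return [w_enhanced * e + (1.0 - w_enhanced) * n
            for n, e in zip(noisy, enhanced)]


def switch_observation(
    noisy: Sequence[float], enhanced: Sequence[float],
    intel_noisy: float, intel_enhanced: float
) -> List[float]:
    """Switching alternative: keep whichever signal scores higher."""
    return list(enhanced) if intel_enhanced >= intel_noisy else list(noisy)


if __name__ == "__main__":
    noisy = [0.0, 2.0]
    enhanced = [4.0, 6.0]
    w = fusion_weight(intel_noisy=0.25, intel_enhanced=0.75)
    print(w, observation_addition(noisy, enhanced, w))
```

In this sketch, switching is the special case of OA where the weight is forced to 0 or 1. The fused waveform is what would be passed to the unchanged ASR backend, so neither the SE model nor the ASR model needs retraining.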
arXiv:2602.20967 (eess) [Submitted on 24 Feb 2026]
Title: Training-Free Intelligibility-Guided Observation Addition for Noisy ASR
Authors: Haoyang Li, Changsong Liu, Wei Rao, Hao Shi, Sakriani Sakti, Eng Siong Chng
Abstract: Automatic speech recognition (ASR) degrades severely in noisy environments. Although speech enhancement (SE) front-ends effectively suppress background noise, they often introduce artifacts that harm recognition. Observation addition (OA) addresses this issue by fusing noisy and SE-enhanced speech, improving recognition without modifying the parameters of the SE or ASR models. This paper proposes an intelligibility-guided OA method, where fusion weights are derived from intelligibility estimates obtained directly from the backend ASR. Unlike prior OA methods based on trained neural predictors, the proposed method is training-free, which reduces complexity and enhances generalization. Extensive experiments across diverse SE-ASR combinations and datasets demonstrate strong robustness and improvements over existing OA baselines. Additional analyses of intelligibility-guided switching-based alternatives and of frame-level versus utterance-level OA further validate the proposed design.
Subjects: Audio and Speech Processing (eess.AS); Arti...