[2603.05231] Boosting ASR Robustness via Test-Time Reinforcement Learning with Audio-Text Semantic Rewards
Computer Science > Sound
arXiv:2603.05231 (cs)
[Submitted on 5 Mar 2026]

Title: Boosting ASR Robustness via Test-Time Reinforcement Learning with Audio-Text Semantic Rewards
Authors: Linghan Fang, Tianxin Xie, Li Liu

Abstract: Recently, Automatic Speech Recognition (ASR) systems (e.g., Whisper) have achieved remarkable accuracy, but they remain highly sensitive to unseen real-world data with large distribution shifts, such as noisy environments and diverse accents. To address this issue, test-time adaptation (TTA) has shown great potential for improving model adaptability at inference time without ground-truth labels; existing TTA methods often rely on pseudo-labeling or entropy minimization. However, by treating model confidence as a learning signal, these methods can reinforce high-confidence errors, leading to a confirmation bias that undermines adaptation. To overcome these limitations, we present ASR-TRA, a novel Test-time Reinforcement Adaptation framework inspired by causal intervention. More precisely, our method introduces a learnable decoder prompt and uses temperature-controlled stochastic decoding to generate diverse transcription candidates. These candidates are scored by a reward model that measures audio-text semantic alignment, and the resulting feedback is u...
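The candidate-generation and reward-scoring loop described in the abstract can be illustrated with a minimal, self-contained sketch. Everything below is a toy stand-in, not the paper's implementation: the per-step logits stand in for a Whisper-style decoder conditioned on a learnable prompt, and `semantic_reward` replaces the paper's audio-text semantic alignment model with simple keyword overlap.

```python
import math
import random

def softmax(logits, temperature):
    # Temperature-controlled softmax: higher temperature flattens the
    # distribution, yielding more diverse stochastic samples.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    z = sum(exps)
    return [e / z for e in exps]

def sample_candidates(step_logits, vocab, temperature, n_candidates, rng):
    # Stochastically decode n candidate transcriptions, token by token,
    # by sampling from the temperature-scaled distribution at each step.
    candidates = []
    for _ in range(n_candidates):
        tokens = []
        for logits in step_logits:
            probs = softmax(logits, temperature)
            tokens.append(rng.choices(vocab, weights=probs, k=1)[0])
        candidates.append(" ".join(tokens))
    return candidates

def semantic_reward(candidate, audio_keywords):
    # Toy stand-in for the audio-text alignment reward model: keyword
    # overlap with the audio content. The paper instead scores semantic
    # similarity between the audio and the candidate transcription.
    words = set(candidate.split())
    return len(words & audio_keywords) / max(len(audio_keywords), 1)

rng = random.Random(0)
vocab = ["the", "cat", "bat", "sat", "sad"]
# Hypothetical per-step decoder logits over the vocabulary (3 decode steps).
step_logits = [
    [3.0, 0.1, 0.1, 0.1, 0.1],   # strongly favors "the"
    [0.1, 2.0, 1.8, 0.1, 0.1],   # "cat" vs. "bat" is ambiguous
    [0.1, 0.1, 0.1, 2.0, 1.8],   # "sat" vs. "sad" is ambiguous
]
audio_keywords = {"the", "cat", "sat"}  # what the (toy) audio actually says

candidates = sample_candidates(step_logits, vocab,
                               temperature=1.5, n_candidates=8, rng=rng)
best = max(candidates, key=lambda c: semantic_reward(c, audio_keywords))
```

In the full method, the rewards of the sampled candidates would serve as the reinforcement signal for updating the learnable decoder prompt, rather than merely selecting the best candidate as done here.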