[2602.12783] SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise

Summary

The paper introduces SQuTR, a new benchmark for evaluating the robustness of spoken query retrieval systems under various acoustic noise conditions, highlighting significant performance drops in existing models.

Why It Matters

As spoken query retrieval becomes increasingly important in information retrieval, understanding its robustness against real-world noise is critical. SQuTR addresses the gap in existing benchmarks, providing a comprehensive dataset and evaluation protocol that can guide future research and development in this area.

Key Takeaways

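  • SQuTR aggregates 37,317 unique queries from six commonly used English and Chinese text retrieval datasets.
  • Speech is synthesized with voice profiles from 200 real speakers and mixed with 17 categories of real-world environmental noise at controlled SNR levels.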
  • Performance of retrieval models significantly declines as noise levels increase.
  • Robustness remains a critical challenge for large-scale retrieval models.
  • SQuTR facilitates reproducible evaluations and diagnostic analysis.

Computer Science > Information Retrieval

arXiv:2602.12783 (cs)
[Submitted on 13 Feb 2026]

Title: SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise

Authors: Yuejie Li, Ke Yang, Yueying Hua, Berlin Chen, Jianhao Nie, Yueping He, Caixin Kang

Abstract: Spoken query retrieval is an important interaction mode in modern information retrieval. However, existing evaluation datasets are often limited to simple queries under constrained noise conditions, making them inadequate for assessing the robustness of spoken query retrieval systems under complex acoustic perturbations. To address this limitation, we present SQuTR, a robustness benchmark for spoken query retrieval that includes a large-scale dataset and a unified evaluation protocol. SQuTR aggregates 37,317 unique queries from six commonly used English and Chinese text retrieval datasets, spanning multiple domains and diverse query types. We synthesize speech using voice profiles from 200 real speakers and mix 17 categories of real-world environmental noise under controlled SNR levels, enabling reproducible robustness evaluation from quiet to highly noisy conditions. Under the unified protocol, we conduct large-scale evaluations on representative cascaded and end-to-end retrieval systems. Experimental results show that...
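The abstract mentions mixing environmental noise into synthesized speech "under controlled SNR levels" but this summary does not say how the authors do it. As a rough illustration only, the sketch below uses the standard power-ratio approach: scale the noise so that 10·log10(P_speech / P_noise) equals the target SNR in dB. The function names (`mix_at_snr`, `measured_snr_db`), the choice to loop the noise clip to the speech length, and the toy signals are all assumptions made for this example, not details from the paper.

```python
import math
import random


def power(samples):
    """Mean squared amplitude of a sequence of audio samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the mixture has the requested SNR in dB.

    Assumption of this sketch: the noise clip is looped or truncated
    to match the speech length before scaling.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    noise = [noise[i % len(noise)] for i in range(len(speech))]
    p_speech = power(speech)
    p_noise = power(noise)
    if p_noise == 0:
        raise ValueError("noise clip is silent")
    # SNR_dB = 10*log10(P_s / (g^2 * P_n))  =>  g = sqrt(P_s / (P_n * 10^(SNR_dB/10)))
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def measured_snr_db(speech, mixture):
    """Recover the SNR of a mixture when the clean speech is known."""
    residual = [m - s for s, m in zip(speech, mixture)]
    return 10 * math.log10(power(speech) / power(residual))


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-ins: a 220 Hz tone as "speech", Gaussian noise as "environment".
    speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    for target in (20, 10, 0, -5):
        mixed = mix_at_snr(speech, noise, target)
        print(target, round(measured_snr_db(speech, mixed), 3))
```

Fixing the gain from measured signal powers, rather than using a hand-tuned volume, is what makes a noise condition repeatable across queries and speakers, which is the property a robustness benchmark of this kind relies on.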
