[2601.12248] AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering

arXiv - Machine Learning · April 29, 2026 · 4 min read

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2601.12248 (eess)

[Submitted on 18 Jan 2026 (v1), last revised 28 Apr 2026 (this version, v2)]

Title: AQUA-Bench: Beyond Finding Answers to Knowing When There Are None in Audio Question Answering

Authors: Chun-Yi Kuan, Hung-yi Lee

Abstract: Recent advances in audio-aware large language models have shown strong performance on audio question answering. However, existing benchmarks mainly cover answerable questions and overlook the challenge of unanswerable ones, where no reliable answer can be inferred from the audio. Such cases are common in real-world settings, where questions may be misleading, ill-posed, or incompatible with the information. To address this gap, we present AQUA-Bench, a benchmark for Audio Question Unanswerability Assessment. It systematically evaluates three scenarios: Absent Answer Detection (the correct option is missing), Incompatible Answer Set Detection (choices are categorically mismatched with the question), and Incompatible Audio Question Detection (the question is irrelevant or lacks sufficient grounding in the audio). By assessing these cases, AQUA-Bench offers a rigorous measure of model reliability and promotes the development of audio-language systems that are more robust and tr...
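The three unanswerability scenarios named in the abstract can be pictured with a small data model. The sketch below is illustrative only and is not the authors' code: the names `Scenario`, `Item`, `is_correct`, and `accuracy_by_scenario` are hypothetical, and treating a prediction of `None` as "the model abstained" is an assumption, since this excerpt does not describe the benchmark's actual scoring protocol.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Sequence


class Scenario(Enum):
    """Item types; the three unanswerable ones follow the abstract's wording."""
    ANSWERABLE = "answerable"
    ABSENT_ANSWER = "absent_answer"  # the correct option is missing
    INCOMPATIBLE_ANSWER_SET = "incompatible_answer_set"  # choices mismatch the question
    INCOMPATIBLE_AUDIO_QUESTION = "incompatible_audio_question"  # question not grounded in the audio


@dataclass(frozen=True)
class Item:
    """Hypothetical multiple-choice item (the audio clip itself is omitted)."""
    question: str
    options: Sequence[str]
    scenario: Scenario
    answer_index: Optional[int] = None  # None for unanswerable items


def is_correct(item: Item, predicted_index: Optional[int]) -> bool:
    """Assumed rule: pick the right option if answerable, abstain (None) otherwise."""
    if item.scenario is Scenario.ANSWERABLE:
        return predicted_index is not None and predicted_index == item.answer_index
    return predicted_index is None


def accuracy_by_scenario(
    items: Sequence[Item], predictions: Sequence[Optional[int]]
) -> Dict[Scenario, float]:
    """Fraction of correctly handled items, reported separately per scenario."""
    totals: Dict[Scenario, int] = {}
    hits: Dict[Scenario, int] = {}
    for item, pred in zip(items, predictions):
        totals[item.scenario] = totals.get(item.scenario, 0) + 1
        hits[item.scenario] = hits.get(item.scenario, 0) + int(is_correct(item, pred))
    return {scenario: hits[scenario] / totals[scenario] for scenario in totals}
```

Reporting accuracy per scenario rather than as one pooled number keeps a model that always picks an option from hiding its failures on unanswerable items behind a high score on answerable ones.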

Originally published on April 29, 2026. Curated by AI News.

Related Articles

[2604.16909] PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations

[2604.07802] Latent Anomaly Knowledge Excavation: Unveiling Sparse Sensitive Neurons in Vision-Language Models

[2602.07605] Fine-R1: Make Multi-modal LLMs Excel in Fine-Grained Visual Recognition by Chain-of-Thought Reasoning

[2602.07096] RealFin: How Well Do LLMs Reason About Finance When Users Leave Things Unsaid?
