[2603.23057] Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Recognition

arXiv - Machine Learning

About this article

Electrical Engineering and Systems Science > Audio and Speech Processing

arXiv:2603.23057 (eess)

[Submitted on 24 Mar 2026]

Title: Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Recognition

Authors: Saurabh Kataria, Xiao Hu

Abstract: Audio-Language Models (ALMs) are making strides in understanding speech and non-speech audio. However, domain-specialist Foundation Models (FMs) remain the best for closed-ended speech processing tasks such as Speech Emotion Recognition (SER). Using ALMs for Zero-shot SER is a popular choice, but their potential to work with specialists to achieve state-of-the-art (SOTA) performance remains unexplored. We propose ZS-Fuse, a late-fusion method that combines zero-shot emotion estimates from a dual-encoder ALM with specialist FMs. To handle ambiguity in emotions and sensitivity to prompt choice, 1) we use a simple prompt ensemble and 2) suggest a novel technique called prompt amplification, which repeats audio and text queries to discover stronger zero-shot capabilities. We demonstrate the efficacy of our technique by evaluating ZS-Fuse with three dual-encoder ALMs and two FMs, and report improvements over SOTA baselines, such as WavLM-Large, on three speech emotion recognition datasets.

Subjects: Audio and...
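The abstract names two ingredients that can be illustrated in a few lines: a prompt ensemble over zero-shot scores from a dual-encoder model, and a late fusion of those scores with a specialist model's class probabilities. The sketch below is only an illustration under stated assumptions, not the paper's ZS-Fuse implementation. The function names, the cosine-similarity scoring, the temperature, the weighted-average fusion rule, and the toy embeddings are all hypothetical. Prompt amplification is not modeled, because the abstract does not describe it beyond repeating audio and text queries.

```python
import math


def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def zero_shot_probs(audio_emb, prompt_embs, temperature=0.1):
    """Prompt ensemble (assumed form): average per-template class probabilities.

    prompt_embs[t][c] is the text embedding of template t filled with class c.
    """
    per_template = [
        softmax([cosine(audio_emb, class_emb) for class_emb in template], temperature)
        for template in prompt_embs
    ]
    n_classes = len(prompt_embs[0])
    return [
        sum(probs[c] for probs in per_template) / len(per_template)
        for c in range(n_classes)
    ]


def late_fuse(zs_probs, specialist_probs, weight=0.5):
    """Late fusion (assumed rule): convex combination of the two distributions."""
    fused = [weight * z + (1.0 - weight) * s for z, s in zip(zs_probs, specialist_probs)]
    total = sum(fused)
    return [f / total for f in fused]


# Toy data, made up for illustration: two classes, two prompt templates.
classes = ["happy", "sad"]
audio_emb = [1.0, 0.0]
prompt_embs = [
    [[1.0, 0.1], [0.0, 1.0]],  # hypothetical template A, one embedding per class
    [[0.9, 0.2], [0.1, 1.0]],  # hypothetical template B, one embedding per class
]
specialist_probs = [0.4, 0.6]  # stand-in for a specialist model's output

zs = zero_shot_probs(audio_emb, prompt_embs)
fused = late_fuse(zs, specialist_probs, weight=0.5)
prediction = classes[fused.index(max(fused))]
```

In this toy setup the specialist leans slightly toward "sad" while the zero-shot estimate strongly favors "happy", so the fused prediction follows the more confident source. In practice the fusion weight would presumably be tuned on held-out data; the abstract does not say how the paper sets it.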

Originally published on March 25, 2026. Curated by AI News.
