[2603.23057] Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Recognition

arXiv - Machine Learning · March 25, 2026

arXiv:2603.23057 (eess), Electrical Engineering and Systems Science > Audio and Speech Processing

Submitted on 24 Mar 2026

Title: Prompt Amplification and Zero-Shot Late Fusion in Audio-Language Models for Speech Emotion Recognition

Authors: Saurabh Kataria, Xiao Hu

Abstract: Audio-Language Models (ALMs) are making strides in understanding speech and non-speech audio. However, domain-specialist Foundation Models (FMs) remain the best for closed-ended speech processing tasks such as Speech Emotion Recognition (SER). Using ALMs for Zero-shot SER is a popular choice, but their potential to work with specialists to achieve state-of-the-art (SOTA) performance remains unexplored. We propose ZS-Fuse, a late-fusion method that combines zero-shot emotion estimates from a dual-encoder ALM with specialist FMs. To handle ambiguity in emotions and sensitivity to prompt choice, 1) we use a simple prompt ensemble and 2) suggest a novel technique called prompt amplification, which repeats audio and text queries to discover stronger zero-shot capabilities. We demonstrate the efficacy of our technique by evaluating ZS-Fuse with three dual-encoder ALMs and two FMs, and report improvements over SOTA baselines, such as WavLM-Large, on three speech emotion recognition datasets.

Subjects: Audio and...
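The abstract names three ingredients (a prompt ensemble, prompt amplification, and late fusion with a specialist model) but gives no formulas, so the following is only a minimal sketch of how such a pipeline could look, not the authors' ZS-Fuse implementation. Everything in it is an assumption made for illustration: the toy embeddings, the cosine-similarity scoring, the softmax temperature, the fusion weight `alpha`, and the function names. In particular, `amplify_text` and `amplify_audio` are just a literal reading of "repeats audio and text queries"; how the paper actually applies the repetition is not stated in the abstract.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities that sum to 1."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def zero_shot_probs(audio_emb, prompt_embs_by_class, temperature=0.1):
    """Prompt ensemble (assumed form): average the audio-text similarity
    over every prompt of a class, then softmax across classes."""
    scores = []
    for prompt_embs in prompt_embs_by_class:
        sims = [cosine(audio_emb, p) for p in prompt_embs]
        scores.append(sum(sims) / len(sims))
    return softmax(scores, temperature)

def amplify_text(prompt, k):
    """Hypothetical amplification of a text query: repeat it k times."""
    return " ".join([prompt] * k)

def amplify_audio(samples, k):
    """Hypothetical amplification of an audio query: repeat the waveform k times."""
    return list(samples) * k

def late_fusion(zs_probs, specialist_probs, alpha=0.5):
    """Late fusion (assumed form): convex combination of the two
    class-probability vectors; alpha is a hypothetical weight."""
    return [alpha * z + (1.0 - alpha) * s for z, s in zip(zs_probs, specialist_probs)]

# Toy stand-ins for encoder outputs; a real system would get these from an
# ALM's audio and text encoders and from a specialist model's classifier head.
EMOTIONS = ["angry", "happy", "sad"]
audio_emb = [0.9, 0.1, 0.0]
prompt_embs_by_class = [
    [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]],  # two prompts for "angry"
    [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]],  # two prompts for "happy"
    [[0.0, 0.0, 1.0], [0.0, 0.2, 0.8]],  # two prompts for "sad"
]
specialist_probs = [0.5, 0.3, 0.2]

zs = zero_shot_probs(audio_emb, prompt_embs_by_class)
fused = late_fusion(zs, specialist_probs, alpha=0.5)
prediction = EMOTIONS[max(range(len(fused)), key=fused.__getitem__)]
```

Because the fusion happens on output probabilities rather than on internal features, neither model needs retraining in this sketch; only `alpha` (and, here, the temperature) would have to be chosen, for example on a held-out set.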

Originally published on March 25, 2026. Curated by AI News.

