[2602.22710] Same Words, Different Judgments: Modality Effects on Preference Alignment
Summary
This study examines how modality affects preference alignment in AI systems, comparing human and synthetic evaluations of identical semantic content presented as text or audio. It finds that audio ratings are as reliable as text ratings, yet the two modalities produce markedly different judgment patterns.
Why It Matters
Understanding modality effects on preference alignment is crucial for developing AI systems that accurately reflect human preferences. This research highlights the reliability of audio evaluations and their implications for improving AI-human interaction, particularly in reinforcement learning frameworks.
Key Takeaways
- Audio preferences show high reliability, comparable to text evaluations.
- Modality influences judgment patterns, with audio raters having narrower decision thresholds.
- Synthetic ratings align with human judgments and predict inter-rater agreement, supporting their use for triaging ambiguous pairs or replacing human annotation.
- This study provides the first ICC-based reliability characterization in preference annotation literature.
- Understanding these differences can enhance AI systems' alignment with human preferences.
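The reliability metric referenced above, ICC(2,k), is the Shrout–Fleiss two-way random-effects intraclass correlation for the average of k raters, computed from ANOVA mean squares. A minimal sketch of that standard formula (the `icc2k` helper and the toy ratings matrix are illustrative, not from the paper):

```python
import numpy as np

def icc2k(X: np.ndarray) -> float:
    """ICC(2,k): two-way random effects, absolute agreement,
    reliability of the average of k raters (Shrout & Fleiss).

    X is an (n_subjects, k_raters) matrix of ratings.
    """
    n, k = X.shape
    grand = X.mean()
    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * np.sum((X.mean(axis=1) - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((X.mean(axis=0) - grand) ** 2) / (k - 1)
    ss_err = (np.sum((X - grand) ** 2)
              - ms_rows * (n - 1) - ms_cols * (k - 1))
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (ms_cols - ms_err) / n)

# Toy example: 3 items, 2 raters who agree up to a constant offset.
ratings = np.array([[0.0, 1.0],
                    [2.0, 3.0],
                    [4.0, 5.0]])
print(round(icc2k(ratings), 4))  # -> 0.9412: high agreement despite the offset
```

The constant offset between raters still lowers ICC(2,k) slightly below 1.0 because the (2,k) form penalizes absolute disagreement, not just inconsistency.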
Full arXiv Record
Computer Science > Sound, arXiv:2602.22710 (cs). Submitted on 26 Feb 2026.
Title: Same Words, Different Judgments: Modality Effects on Preference Alignment
Authors: Aaron Broukhim, Nadir Weibel, Eshin Jolly
Abstract: Preference-based reinforcement learning (PbRL) is the dominant framework for aligning AI systems to human preferences, but its application to speech remains underexplored. We present a controlled cross-modal study of human and synthetic preference annotations, comparing text and audio evaluations of identical semantic content across 100 prompts. Audio preferences prove as reliable as text, with inter-rater agreement reaching good levels (ICC(2,k) $\approx$ .80) at $\sim$9 raters -- the first ICC-based reliability characterization in the preference annotation literature for either modality. However, modality reshapes how people judge: audio raters exhibit narrower decision thresholds, reduced length bias, and more user-oriented evaluation criteria, with near-chance cross-modality agreement. Synthetic ratings further align with human judgments and predict inter-rater agreement, supporting their use both for triaging ambiguous pairs and as full replacements for human annotations.
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
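The abstract's claim that agreement reaches ICC(2,k) $\approx$ .80 at $\sim$9 raters reflects the standard Spearman-Brown relationship between single-rater reliability and the reliability of an average over k raters. A brief sketch, assuming an illustrative single-rater reliability of 0.31 (a back-solved value for demonstration, not a number reported in the paper):

```python
import math

def spearman_brown(r1: float, k: int) -> float:
    """Reliability of the mean of k raters, given single-rater reliability r1."""
    return k * r1 / (1 + (k - 1) * r1)

def raters_needed(r1: float, target: float) -> int:
    """Smallest k whose averaged reliability reaches `target` (inverted formula)."""
    return math.ceil(target * (1 - r1) / (r1 * (1 - target)))

# With single-rater reliability ~0.31, averaging ~9 raters reaches ICC ~ .80.
print(round(spearman_brown(0.31, 9), 2))
print(raters_needed(0.31, 0.80))
```

The practical upshot is that even modest per-rater reliability yields good aggregate reliability once enough raters are pooled, which is why the paper can report "good" agreement at roughly nine annotators.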