[2511.17879] Generative Adversarial Post-Training Mitigates Reward Hacking in Live Human-AI Music Interaction

arXiv - Machine Learning · 4 min read

Summary

This paper presents a generative adversarial post-training method that mitigates reward hacking in real-time human-AI music interaction, preserving output diversity while keeping the model adaptive to a live partner.

Why It Matters

As AI systems increasingly take part in collaborative tasks such as live music jamming, they must stay both responsive and creatively varied. This research targets reward hacking, a well-known failure mode of reinforcement learning in which a policy exploits its reward signal and collapses to repetitive, low-diversity output, which is especially damaging in dynamic, interactive settings.

Key Takeaways

  • Introduces a generative adversarial training method to improve AI music interaction.
  • Addresses the issue of reward hacking that reduces output diversity in AI systems.
  • Demonstrates improved adaptability and user agency in live music settings.
  • Utilizes both quantitative evaluations and user studies to validate findings.
  • Highlights the importance of maintaining creativity in AI collaborations.

Computer Science > Machine Learning
arXiv:2511.17879 (cs)
[Submitted on 22 Nov 2025 (v1), last revised 15 Feb 2026 (this version, v3)]

Title: Generative Adversarial Post-Training Mitigates Reward Hacking in Live Human-AI Music Interaction

Authors: Yusong Wu, Stephen Brade, Aleksandra Teng Ma, Tia-Jane Fowler, Enning Yang, Berker Banar, Aaron Courville, Natasha Jaques, Cheng-Zhi Anna Huang

Abstract: Most applications of generative AI involve a sequential interaction in which a person inputs a prompt and waits for a response, and where reaction time and adaptivity are not important factors. In contrast, live jamming is a collaborative interaction that requires real-time coordination and adaptation without access to the other player's future moves, while preserving diversity to sustain a creative flow. Reinforcement learning post-training enables effective adaptation through on-policy interaction, yet it often reduces output diversity by exploiting coherence-based rewards. This collapse, known as "reward hacking", affects many RL post-training pipelines, but is especially harmful in live jamming, where musical creativity relies on dynamic variation and mutual responsiveness. In this paper, we propose a novel adversarial training method on policy-generated trajectories to mitigate reward ha...
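To make the reward-hacking mechanism concrete, here is a minimal toy sketch (not the paper's actual method or models): a coherence-style reward that a policy can "hack" by collapsing to repetitive output, and an adversarial-style penalty that compares a trajectory's diversity statistics against reference data, restoring the preference for varied output. All function names, the variance-based statistics, and the `lam` weight are illustrative assumptions.

```python
import numpy as np

def coherence_reward(traj):
    # Toy stand-in for a coherence-based reward: it prefers low variance,
    # so a policy can "hack" it by emitting the same note over and over.
    return -np.var(traj)

def discriminator_penalty(traj, reference_trajs):
    # Toy stand-in for an adversarial discriminator: penalizes trajectories
    # whose diversity statistic deviates from human reference data.
    ref_var = np.mean([np.var(t) for t in reference_trajs])
    return abs(np.var(traj) - ref_var)

def shaped_reward(traj, reference_trajs, lam=2.0):
    # Combined objective: task reward minus an adversarial term that
    # discourages collapse toward degenerate, repetitive trajectories.
    return coherence_reward(traj) - lam * discriminator_penalty(traj, reference_trajs)

# Two "human" reference trajectories with nonzero variation.
reference = [np.array([0.0, 1.0, 0.0, 1.0]), np.array([1.0, 0.0, 1.0, 0.0])]
collapsed = np.zeros(4)                      # repetitive: maximizes the raw reward
diverse = np.array([0.0, 1.0, 1.0, 0.0])     # matches the reference diversity

# Raw reward favors the collapsed trajectory (0.0 vs -0.25) ...
print(coherence_reward(collapsed) > coherence_reward(diverse))   # True
# ... but the adversarial term flips the ordering (-0.25 vs -0.5).
print(shaped_reward(diverse, reference) > shaped_reward(collapsed, reference))  # True
```

The point of the sketch is the ordering flip: under the raw coherence reward the degenerate trajectory wins, while the discriminator-shaped reward prefers the trajectory whose statistics match human data, which is the intuition behind adversarial post-training on policy-generated trajectories.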

