[2602.15206] MAVRL: Learning Reward Functions from Multiple Feedback Types with Amortized Variational Inference

arXiv - AI · 4 min read

Summary

The paper presents MAVRL, an approach for learning reward functions from multiple feedback types via amortized variational inference, improving robustness and interpretability in reinforcement learning.

Why It Matters

Understanding how to effectively learn from diverse feedback types is crucial in machine learning, particularly in reinforcement learning. This research addresses a significant gap by proposing a unified method that improves performance and interpretability, which is essential for developing reliable AI systems.

Key Takeaways

  • MAVRL formulates reward learning as Bayesian inference over a shared latent reward function.
  • The method integrates multiple feedback types without the need for manual loss balancing.
  • Jointly inferred reward posteriors show improved performance compared to single-type baselines.
  • The approach enhances robustness to environmental changes and provides interpretable signals for model confidence.
  • This research contributes to the advancement of AI systems that can learn from diverse and potentially conflicting feedback.

Computer Science > Machine Learning
arXiv:2602.15206 (cs)
[Submitted on 16 Feb 2026]

Title: MAVRL: Learning Reward Functions from Multiple Feedback Types with Amortized Variational Inference
Authors: Raphaël Baur, Yannick Metz, Maria Gkoulta, Mennatallah El-Assady, Giorgia Ramponi, Thomas Kleine Buening

Abstract: Reward learning typically relies on a single feedback type or combines multiple feedback types using manually weighted loss terms. Currently, it remains unclear how to jointly learn reward functions from heterogeneous feedback types such as demonstrations, comparisons, ratings, and stops that provide qualitatively different signals. We address this challenge by formulating reward learning from multiple feedback types as Bayesian inference over a shared latent reward function, where each feedback type contributes information through an explicit likelihood. We introduce a scalable amortized variational inference approach that learns a shared reward encoder and feedback-specific likelihood decoders and is trained by optimizing a single evidence lower bound. Our approach avoids reducing feedback to a common intermediate representation and eliminates the need for manual loss balancing. Across discrete and continuous-control benchmarks, we show that jointly inferred reward poste...
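The abstract's central idea — a single latent reward function whose posterior is updated by feedback-specific likelihoods under one evidence lower bound — can be sketched in a toy form. Everything below is an illustrative assumption rather than the paper's exact formulation: a linear reward model, a mean-field Gaussian variational posterior over reward weights, a Bradley-Terry likelihood for comparisons, and a Gaussian likelihood for ratings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: states are 3-d feature vectors phi(s), and the
# latent reward is linear, r(s) = w @ phi(s). The variational posterior
# q(w) = N(mu, diag(exp(log_var))) stands in for the shared reward encoder.
mu = np.zeros(3)
log_var = np.zeros(3)

def sample_w(mu, log_var, eps):
    # Reparameterization trick: w = mu + sigma * eps, eps ~ N(0, I)
    return mu + np.exp(0.5 * log_var) * eps

# Feedback-specific log-likelihood "decoders" (illustrative forms only):
def loglik_comparison(w, phi_a, phi_b):
    # Bradley-Terry: log P(a preferred over b) = log sigmoid(r(a) - r(b))
    return -np.log1p(np.exp(-(w @ phi_a - w @ phi_b)))

def loglik_rating(w, phi, rating, noise=1.0):
    # Gaussian rating model: rating ~ N(r(s), noise^2), up to a constant
    return -0.5 * ((rating - w @ phi) / noise) ** 2

def elbo(mu, log_var, eps, comparisons, ratings):
    # A single evidence lower bound pools all feedback types with no
    # manual loss weights: E_q[log p(feedback | w)] - KL(q(w) || p(w))
    w = sample_w(mu, log_var, eps)
    ll = sum(loglik_comparison(w, a, b) for a, b in comparisons)
    ll += sum(loglik_rating(w, phi, y) for phi, y in ratings)
    # Closed-form KL between N(mu, sigma^2) and the N(0, I) prior
    kl = 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)
    return ll - kl

# Toy example: one comparison and one rating, single Monte Carlo sample.
phi_a, phi_b = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
eps = rng.standard_normal(3)
value = elbo(mu, log_var, eps, [(phi_a, phi_b)], [(phi_a, 1.0)])
```

The point of the sketch is that adding a new feedback type only means adding another log-likelihood term to the same objective; the KL term and the shared posterior are untouched, which is how the paper avoids manual loss balancing.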
