[2603.01694] MVR: Multi-view Video Reward Shaping for Reinforcement Learning

Computer Science > Computer Vision and Pattern Recognition

arXiv:2603.01694 (cs)
[Submitted on 2 Mar 2026]

Title: MVR: Multi-view Video Reward Shaping for Reinforcement Learning

Authors: Lirui Luo, Guoxi Zhang, Hongming Xu, Yaodong Yang, Cong Fang, Qing Li

Abstract: Reward design is of great importance for solving complex tasks with reinforcement learning. Recent studies have explored using image-text similarity produced by vision-language models (VLMs) to augment rewards of a task with visual feedback. A common practice linearly adds VLM scores to task or success rewards without explicit shaping, potentially altering the optimal policy. Moreover, such approaches, often relying on single static images, struggle with tasks whose desired behavior involves complex, dynamic motions spanning multiple visually different states. Furthermore, single viewpoints can occlude critical aspects of an agent's behavior. To address these issues, this paper presents Multi-View Video Reward Shaping (MVR), a framework that models the relevance of states regarding the target task using videos captured from multiple viewpoints. MVR leverages video-text similarity from a frozen pre-trained VLM to learn a state relevance function that mitigates the bias towards specific static poses inherent in image-based methods. Additionally, we introduce a ...
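The abstract's concern that linearly adding VLM scores can alter the optimal policy can be illustrated with a small sketch. It contrasts a naive per-step bonus with classic potential-based shaping, where the shaping term is gamma * Phi(s') - Phi(s). This is the textbook construction, used here only as an illustration; the truncated abstract does not confirm that MVR uses this exact form. The function names, the integer states, and the toy `relevance` table are all hypothetical stand-ins for a learned state relevance function.

```python
from typing import Callable, Sequence

State = int  # hypothetical: states are plain integers in this toy example


def naive_bonus_return(
    states: Sequence[State], score: Callable[[State], float], gamma: float
) -> float:
    """Discounted sum of raw scores added at every visited (non-final) state."""
    return sum(gamma**t * score(s) for t, s in enumerate(states[:-1]))


def shaped_bonus_return(
    states: Sequence[State], potential: Callable[[State], float], gamma: float
) -> float:
    """Discounted sum of potential-based terms gamma * Phi(s') - Phi(s)."""
    return sum(
        gamma**t * (gamma * potential(s_next) - potential(s))
        for t, (s, s_next) in enumerate(zip(states[:-1], states[1:]))
    )


# Hypothetical relevance scores standing in for a VLM-derived signal.
relevance = {0: 0.0, 1: 0.9, 2: 0.1, 3: 0.5}.__getitem__
gamma = 0.9

# Two trajectories with the same start state, end state, and length.
path_a = [0, 1, 1, 3]  # lingers in a high-scoring state
path_b = [0, 2, 2, 3]  # lingers in a low-scoring state

print("naive :", naive_bonus_return(path_a, relevance, gamma),
      naive_bonus_return(path_b, relevance, gamma))
print("shaped:", shaped_bonus_return(path_a, relevance, gamma),
      shaped_bonus_return(path_b, relevance, gamma))
```

In this toy setting the naive bonus favors the path that lingers in the high-scoring state, while the shaped bonus telescopes to gamma^T * Phi(s_T) - Phi(s_0) and is therefore identical for both paths. That path independence is why potential-based terms leave the ranking of policies unchanged, whereas a raw additive score can change it.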

Originally published on March 03, 2026. Curated by AI News.
