[2602.12636] Dual-Granularity Contrastive Reward via Generated Episodic Guidance for Efficient Embodied RL


Summary

This paper introduces the Dual-Granularity Contrastive Reward framework, which enhances sample efficiency in reinforcement learning (RL) for embodied tasks without requiring extensive human supervision.

Why It Matters

As reinforcement learning applications expand, particularly in robotics, the challenge of designing effective reward systems remains critical. This research addresses the limitations of existing methods by proposing a novel approach that reduces reliance on human-annotated data, potentially accelerating advancements in autonomous systems.

Key Takeaways

  • The Dual-Granularity Contrastive Reward framework improves sample efficiency in RL.
  • It utilizes generated episodic guidance from a limited number of expert videos.
  • The framework balances coarse and fine-grained rewards to enhance agent training.
  • Extensive experiments demonstrate its effectiveness across diverse tasks.
  • This approach could lead to more autonomous and efficient robotic systems.

Computer Science > Machine Learning — arXiv:2602.12636 (cs) [Submitted on 13 Feb 2026]

Title: Dual-Granularity Contrastive Reward via Generated Episodic Guidance for Efficient Embodied RL

Authors: Xin Liu, Yixuan Li, Yuhui Chen, Yuxing Qin, Haoran Li, Dongbin Zhao

Abstract: Designing suitable rewards poses a significant challenge in reinforcement learning (RL), especially for embodied manipulation. Trajectory success rewards are easy for humans to judge or for models to fit, but their sparsity severely limits RL sample efficiency. While recent methods have effectively improved RL via dense rewards, they rely heavily on high-quality human-annotated data or abundant expert supervision. To tackle these issues, this paper proposes Dual-granularity contrastive reward via generated Episodic Guidance (DEG), a novel framework that seeks sample-efficient dense rewards without requiring human annotations or extensive supervision. Leveraging the prior knowledge of large video generation models, DEG needs only a small number of expert videos for domain adaptation to generate dedicated task guidance for each RL episode. Then, the proposed dual-granularity reward, which balances coarse-grained exploration and fine-grained matching, guides the agent to efficiently approximate the generated guidance video sequentially in the contrastive...
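The abstract describes a reward that combines a coarse-grained exploration term with a fine-grained term that matches the agent's progress against the generated guidance video sequentially. The paper's exact formulation is not given here, so the sketch below is a hypothetical illustration of that idea: embeddings of the current observation are compared against embeddings of the guidance frames, with the coarse term rewarding a match to any frame and the fine term rewarding a match to the temporally aligned frame. All function names, the cosine-similarity choice, and the `alpha` weighting are assumptions for illustration, not DEG's actual method.

```python
import numpy as np

def cosine_sim(query, frames):
    # Cosine similarity between one embedding and a batch of frame embeddings.
    query = query / (np.linalg.norm(query) + 1e-8)
    frames = frames / (np.linalg.norm(frames, axis=-1, keepdims=True) + 1e-8)
    return frames @ query

def dual_granularity_reward(obs_emb, guidance_embs, t, alpha=0.5):
    """Hypothetical dual-granularity reward sketch.

    obs_emb:       embedding of the agent's current observation
    guidance_embs: (T, D) embeddings of the generated guidance video frames
    t:             current step index within the episode
    alpha:         assumed weight balancing the coarse and fine terms
    """
    sims = cosine_sim(obs_emb, guidance_embs)

    # Coarse-grained term: best match against ANY guidance frame,
    # rewarding overall progress toward the demonstrated behavior.
    coarse = float(np.max(sims))

    # Fine-grained term: match against the temporally aligned frame,
    # encouraging the agent to follow the guidance sequentially.
    idx = min(t, len(guidance_embs) - 1)
    fine = float(sims[idx])

    return alpha * coarse + (1.0 - alpha) * fine
```

In this sketch, an observation identical to the guidance frame at step `t` scores the maximum reward, while an observation matching an earlier frame still earns partial credit through the coarse term, which is one plausible way to keep the reward dense during exploration.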

