[2510.02240] RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

arXiv - AI 4 min read Article

Summary

The paper presents RewardMap, a multi-stage reinforcement learning framework aimed at improving fine-grained visual reasoning in multimodal large language models by addressing sparse rewards and enhancing training efficiency.

Why It Matters

Fine-grained visual reasoning underpins applications that depend on spatial understanding, yet standard reinforcement learning on such tasks is held back by sparse rewards. By pairing denser reward signals with staged training, this work aims to strengthen MLLMs on these tasks, with potential relevance to domains such as robotics and computer vision.

Key Takeaways

  • RewardMap tackles sparse rewards in visual reasoning tasks.
  • The framework utilizes a difficulty-aware reward design for richer supervision.
  • It introduces a multi-stage RL scheme for effective cold-start training.
  • Experiments show an average improvement of 3.47% across multiple benchmarks.
  • The proposed methods enhance both visual understanding and reasoning capabilities.
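To make the contrast between sparse and richer supervision concrete, here is a minimal Python sketch. It is not the paper's reward formula: the `Sample` fields, the partial-credit term, the `detail_weight` parameter, and the `(1 + difficulty)` scaling are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One hypothetical training example; field names are illustrative only."""
    reference: list[str]   # reference answer as an ordered list of steps
    difficulty: float      # assumed difficulty score in [0, 1]


def sparse_reward(prediction: list[str], sample: Sample) -> float:
    """All-or-nothing reward: 1.0 only if the whole answer matches."""
    return 1.0 if prediction == sample.reference else 0.0


def difficulty_aware_reward(prediction: list[str], sample: Sample,
                            detail_weight: float = 0.5) -> float:
    """Exact-match reward plus partial credit, scaled up for harder samples.

    The partial-credit term and the (1 + difficulty) scaling are assumptions
    made for illustration, not the formula from the paper.
    """
    correct = sparse_reward(prediction, sample)
    # Partial credit: fraction of reference steps matched position by position.
    matched = sum(p == r for p, r in zip(prediction, sample.reference))
    detail = matched / max(len(sample.reference), 1)
    return (1.0 + sample.difficulty) * (correct + detail_weight * detail)
```

Under these assumptions, a nearly correct answer to a hard question earns a non-zero reward where the sparse version returns zero, which is the kind of signal that can keep early training from stalling.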

Computer Science > Computer Vision and Pattern Recognition

arXiv:2510.02240 (cs)
[Submitted on 2 Oct 2025 (v1), last revised 21 Feb 2026 (this version, v2)]

Title: RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning

Authors: Sicheng Feng, Kaiwen Tuo, Song Wang, Lingdong Kong, Jianke Zhu, Huan Wang

Abstract: Fine-grained visual reasoning remains a core challenge for multimodal large language models (MLLMs). The recently introduced ReasonMap highlights this gap by showing that even advanced MLLMs struggle with spatial reasoning in structured and information-rich settings such as transit maps, a task of clear practical and scientific importance. However, standard reinforcement learning (RL) on such tasks is impeded by sparse rewards and unstable optimization. To address this, we first construct ReasonMap-Plus, an extended dataset that introduces dense reward signals through Visual Question Answering (VQA) tasks, enabling effective cold-start training of fine-grained visual understanding skills. Next, we propose RewardMap, a multi-stage RL framework designed to improve both visual understanding and reasoning capabilities of MLLMs. RewardMap incorporates two key designs. First, we introduce a difficulty-aware reward design tha...
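One way to picture a multi-stage schedule that starts from simpler VQA-style questions before moving to harder reasoning is the easy-to-hard split sketched below. This is only an illustration: the `Task` type, the task names, the difficulty scores, and the equal-size staging rule are hypothetical and not taken from the paper.

```python
from typing import NamedTuple


class Task(NamedTuple):
    name: str
    difficulty: float  # assumed score in [0, 1]; higher means harder


def build_stages(tasks: list[Task], num_stages: int) -> list[list[Task]]:
    """Sort tasks from easy to hard and split them into contiguous stages."""
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ordered = sorted(tasks, key=lambda t: t.difficulty)
    # Ceiling division so that every task lands in some stage.
    size = max(-(-len(ordered) // num_stages), 1)
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


# Hypothetical task mix: simple perception questions and harder route planning.
tasks = [
    Task("route_planning", 0.9),
    Task("count_stations", 0.2),
    Task("true_false", 0.1),
    Task("short_route", 0.6),
]
stages = build_stages(tasks, num_stages=2)
```

A training loop would then run RL on `stages[0]` first and move on to `stages[1]` afterwards, so the model meets the harder, sparsely rewarded questions only after a cold start on the easier ones.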
