[2603.00207] VisRef: Visual Refocusing while Thinking Improves Test-Time Scaling in Multi-Modal Large Reasoning Models
Computer Science > Computer Vision and Pattern Recognition

arXiv:2603.00207 (cs)

[Submitted on 27 Feb 2026]

Title: VisRef: Visual Refocusing while Thinking Improves Test-Time Scaling in Multi-Modal Large Reasoning Models

Authors: Soumya Suvra Ghosal, Youngeun Kim, Zhuowei Li, Ritwick Chaudhry, Linghan Xu, Hongjing Zhang, Jakub Zablocki, Yifan Xing, Qin Zhang

Abstract: Advances in large reasoning models have shown strong performance on complex reasoning tasks by scaling test-time compute through extended reasoning. However, recent studies observe that in vision-dependent tasks, extended textual reasoning at inference time can degrade performance as models progressively lose attention to visual tokens and increasingly rely on textual priors alone. To address this, prior works use reinforcement learning (RL)-based fine-tuning to route visual tokens or employ refocusing mechanisms during reasoning. While effective, these methods are computationally expensive, requiring large-scale data generation and policy optimization. To leverage the benefits of test-time compute without additional RL fine-tuning, we propose VisRef, a visually grounded test-time scaling framework. Our key idea is to actively guide the reasoning process by re-injecting a coreset of visual tokens that are ...
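The abstract is truncated before it specifies how the coreset of visual tokens is selected, so the following is only a minimal illustrative sketch of the general idea of test-time visual re-injection, not the paper's method. It hypothetically assumes the coreset is chosen by the attention mass each visual token has received from the reasoning tokens generated so far; the function names and the top-k criterion are this sketch's own assumptions.

    # Hypothetical sketch: pick a small "coreset" of visual tokens by accumulated
    # attention and append them back onto the running context, so later reasoning
    # steps can re-attend to visual evidence. Selection rule and names are assumed,
    # not taken from the paper.
    import torch

    def select_visual_coreset(visual_embeds: torch.Tensor,
                              attn_to_visual: torch.Tensor,
                              k: int = 16) -> torch.Tensor:
        """Return the k visual-token embeddings that received the most attention.

        visual_embeds:  (num_visual_tokens, hidden_dim) image-token embeddings.
        attn_to_visual: (num_visual_tokens,) attention mass each visual token
                        received from the generated reasoning tokens so far.
        """
        k = min(k, visual_embeds.shape[0])
        top_idx = torch.topk(attn_to_visual, k).indices
        return visual_embeds[top_idx]

    def reinject_coreset(context_embeds: torch.Tensor,
                         coreset: torch.Tensor) -> torch.Tensor:
        """Concatenate the coreset onto the current context embeddings."""
        return torch.cat([context_embeds, coreset], dim=0)

    if __name__ == "__main__":
        # Toy tensors standing in for a real multi-modal model's states.
        visual = torch.randn(196, 768)    # e.g. 14x14 image patch tokens
        attn = torch.rand(196)            # accumulated attention per patch
        context = torch.randn(512, 768)   # current text + visual context
        coreset = select_visual_coreset(visual, attn, k=16)
        print(reinject_coreset(context, coreset).shape)  # torch.Size([528, 768])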