[2604.01840] Not All Tokens See Equally: Perception-Grounded Policy Optimization for Large Vision-Language Models
Computer Science > Artificial Intelligence

arXiv:2604.01840 (cs)

[Submitted on 2 Apr 2026 (v1), last revised 8 Apr 2026 (this version, v2)]

Title: Not All Tokens See Equally: Perception-Grounded Policy Optimization for Large Vision-Language Models

Authors: Zekai Ye, Qiming Li, Xiaocheng Feng, Ruihan Chen, Ziming Li, Haoyu Ren, Kun Chen, Dandan Tu, Bing Qin

Abstract: While Reinforcement Learning from Verifiable Rewards (RLVR) has advanced reasoning in Large Vision-Language Models (LVLMs), prevailing frameworks suffer from a foundational methodological flaw: by distributing identical advantages across all generated tokens, these methods dilute the learning signal essential for optimizing the critical, visually grounded steps of multimodal reasoning. To bridge this gap, we formulate \textit{Token Visual Dependency}, which quantifies the causal information gain from visual inputs as the Kullback-Leibler (KL) divergence between the visual-conditioned and text-only predictive distributions. Observing that this dependency is highly sparse and semantically pivotal, we introduce Perception-Grounded Policy Optimization (PGPO), a novel fine-grained credit-assignment framework that dynamically reshapes advantages at the token level. Through a threshold-gated, mass-conserving mechanism, ...
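The two quantities the abstract names can be made concrete. Below is a minimal PyTorch sketch, not the paper's implementation: it computes per-token visual dependency as KL(p_visual || p_text_only) between next-token distributions decoded with and without the image, then applies one plausible threshold-gated, mass-conserving advantage reshaping. The function names, the threshold `tau`, the gain `beta`, and the mean-one weight renormalization are all illustrative assumptions; the abstract only names the mechanism without specifying it.

    import torch
    import torch.nn.functional as F

    def token_visual_dependency(logits_visual, logits_text_only):
        # Per-token KL(p_visual || p_text_only) over the vocabulary.
        # logits_*: (seq_len, vocab_size) next-token logits from the same
        # policy, decoded with and without the image in the context.
        log_p_vis = F.log_softmax(logits_visual, dim=-1)
        log_p_txt = F.log_softmax(logits_text_only, dim=-1)
        return (log_p_vis.exp() * (log_p_vis - log_p_txt)).sum(dim=-1)

    def reshape_advantages(advantage, dependency, tau=0.1, beta=1.0):
        # Start from the uniform per-token advantage that standard RLVR
        # assigns (the same scalar at every position), then up-weight
        # tokens whose visual dependency exceeds the gate threshold `tau`.
        seq_len = dependency.shape[0]
        base = torch.full((seq_len,), advantage)
        gate = (dependency > tau).float()              # threshold gate
        weights = 1.0 + beta * gate * dependency       # boost grounded tokens
        weights = weights * (seq_len / weights.sum())  # renormalize to mean 1
        # Mass conservation: sum(base * weights) == advantage * seq_len,
        # so the sequence's total advantage is redistributed, not changed.
        return base * weights

Usage would look like dep = token_visual_dependency(logits_v, logits_t) followed by adv = reshape_advantages(group_advantage, dep). The mean-one renormalization is one simple way to guarantee mass conservation: grounded tokens gain advantage exactly at the expense of ungrounded ones, leaving the sequence-level signal intact.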