[2506.10085] VITA: Zero-Shot Value Functions via Test-Time Adaptation of Vision-Language Models
Computer Science > Computer Vision and Pattern Recognition

arXiv:2506.10085 (cs)

[Submitted on 11 Jun 2025 (v1), last revised 27 Feb 2026 (this version, v5)]

Title: VITA: Zero-Shot Value Functions via Test-Time Adaptation of Vision-Language Models

Authors: Christos Ziakas, Alessandra Russo

Abstract: Vision-Language Models (VLMs) show promise as zero-shot goal-conditioned value functions, but their frozen pre-trained representations limit generalization and temporal reasoning. We introduce VITA, a zero-shot value function learning method that enhances both capabilities via test-time adaptation. At inference, a lightweight adaptation module is updated via a gradient step on a meta-learned self-supervised loss, such that each test-time update improves value estimation. By updating sequentially over a trajectory, VITA encodes history into its parameters, addressing the temporal reasoning limitations. To mitigate shortcut learning, we propose a dissimilarity-based sampling strategy that selects semantically diverse segments of the trajectory during training. In real-world robotic manipulation tasks, VITA generalizes from a single training environment to diverse out-of-distribution tasks, environments, and embodiments, outperforming the state-of-the-art zero-shot method using autoregressive VLMs. Furt...
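To make the core mechanism concrete, the following is a minimal, illustrative sketch of the sequential test-time adaptation loop the abstract describes: a lightweight module on top of frozen features receives one gradient step per observation, so trajectory history accumulates in its parameters. All names here are hypothetical, and the loss is a simple temporal-consistency surrogate standing in for the paper's meta-learned self-supervised loss; this is not the authors' implementation.

```python
import numpy as np

def adapt_over_trajectory(features, lr=0.1, margin=0.5):
    """Sequentially adapt a lightweight linear value head over a trajectory.

    `features` stands in for frozen VLM embeddings, one row per frame.
    The surrogate self-supervised loss encourages the predicted value to
    increase by at least `margin` along the trajectory (illustrative only;
    the actual loss in VITA is meta-learned).
    """
    w = np.zeros(features.shape[1])  # lightweight adaptation module
    prev_value = 0.0
    values = []
    for feat in features:            # one test-time gradient step per frame
        value = float(feat @ w)
        residual = prev_value + margin - value
        if residual > 0:
            # gradient of residual**2 w.r.t. w is -2 * residual * feat,
            # so gradient descent adds 2 * lr * residual * feat
            w += lr * 2.0 * residual * feat
        value = float(feat @ w)      # re-estimate after the update
        values.append(value)
        prev_value = value           # history carried in w and prev_value
    return values

rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))      # mock frozen features for 5 frames
vals = adapt_over_trajectory(feats)
print(len(vals))                     # 5
```

Because the head is updated in place at each step, later value estimates depend on everything seen earlier in the trajectory, which is the property the abstract invokes to address temporal reasoning with an otherwise frozen encoder.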