[2602.03151] Enhancing Foundation VLM Robustness to Missing Modality: Scalable Diffusion for Bi-directional Feature Restoration

arXiv - AI April 07, 2026 4 min read

About this article

Abstract page for arXiv paper 2602.03151: Enhancing Foundation VLM Robustness to Missing Modality: Scalable Diffusion for Bi-directional Feature Restoration

Computer Science > Artificial Intelligence

arXiv:2602.03151 (cs)

[Submitted on 3 Feb 2026 (v1), last revised 6 Apr 2026 (this version, v2)]

Title: Enhancing Foundation VLM Robustness to Missing Modality: Scalable Diffusion for Bi-directional Feature Restoration

Authors: Wei Dai, Haoyu Wang, Honghao Chang, Lijun He, Fan Li, Jian Sun, Haixia Bi

Abstract: Vision Language Models (VLMs) typically assume complete modality input during inference. However, their effectiveness drops sharply when certain modalities are unavailable or incomplete. Current research on missing modality primarily faces two dilemmas: prompt-based methods struggle to restore missing yet indispensable features and degrade the generalizability of VLMs, while imputation-based approaches, lacking effective guidance, are prone to generating semantically irrelevant noise. Restoring precise semantics while sustaining the VLM's generalization remains challenging. Therefore, we propose a general missing modality restoration strategy in this paper. We introduce an enhanced diffusion model as a pluggable mid-stage training module to effectively restore missing features. Our strategy introduces two key innovations: (I) Dynamic Modality Gating, which adaptively leverages conditional features to guide the generation of semantically ...

Originally published on April 07, 2026. Curated by AI News.
