[2511.19413] UniGame: Turning a Unified Multimodal Model Into Its Own Adversary
Computer Science > Machine Learning
arXiv:2511.19413 (cs)
[Submitted on 24 Nov 2025 (v1), last revised 30 Mar 2026 (this version, v3)]

Title: UniGame: Turning a Unified Multimodal Model Into Its Own Adversary
Authors: Zhaolong Su, Wang Lu, Hao Chen, Sharon Li, Jindong Wang

Abstract: Unified Multimodal Models (UMMs) have shown impressive performance in both understanding and generation with a single architecture. However, UMMs still exhibit a fundamental inconsistency: understanding favors compact embeddings, whereas generation favors reconstruction-rich representations. This structural trade-off produces misaligned decision boundaries, degraded cross-modal coherence, and heightened vulnerability under distributional and adversarial shifts. In this paper, we present UniGame, a self-adversarial post-training framework that directly targets this inconsistency. By applying a lightweight perturber at the shared token interface, UniGame enables the generation branch to actively seek out and challenge fragile understanding, turning the model into its own adversary. Experiments demonstrate that UniGame significantly improves consistency (+4.6%). Moreover, it also achieves substantial improvements in understanding (+3.6%), generation (+0.02 on GenEval), and out-of-distribution and adversarial robustness (+4.8% and +6.2% o...
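The core mechanism the abstract describes, a lightweight perturber acting on the shared token interface so that the model adversarially stresses its own understanding branch, can be illustrated with a toy sketch. Everything below (the linear head, the mean-pooled tokens, the random-search perturber, and all names and shapes) is an illustrative assumption for exposition, not the paper's actual implementation, which trains the perturber jointly with the model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the shared token interface and an "understanding" head.
# Shapes and the linear head are illustrative assumptions, not UniGame's code.
T, D, C = 4, 8, 3          # tokens, embedding dim, classes
tokens = rng.normal(size=(T, D))
w_head = rng.normal(size=(D, C))
label = 1
eps = 0.1                  # perturbation budget at the token interface

def understanding_loss(tok, w):
    """Cross-entropy of a linear head on mean-pooled tokens (toy proxy)."""
    logits = tok.mean(axis=0) @ w
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label] + 1e-12)

def apply_perturber(tok, w_p):
    """Lightweight linear perturber whose output shift is norm-bounded by eps."""
    delta = tok @ w_p
    delta = eps * delta / (np.linalg.norm(delta) + 1e-12)
    return tok + delta

# Self-adversarial step: search for perturber weights that *increase* the
# understanding loss (random search here stands in for gradient training).
clean_loss = understanding_loss(tokens, w_head)
best_w_p, worst_loss = None, clean_loss
for _ in range(200):
    w_p = rng.normal(size=(D, D))
    loss = understanding_loss(apply_perturber(tokens, w_p), w_head)
    if loss > worst_loss:
        best_w_p, worst_loss = w_p, loss

print(f"clean loss: {clean_loss:.4f}, adversarial loss: {worst_loss:.4f}")
```

In the full framework this inner maximization would alternate with a minimization step that updates the model on the perturbed tokens, closing the min-max game; the sketch only shows the adversary's side and the norm budget that keeps the perturbation "lightweight."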