[2602.18252] On the Adversarial Robustness of Discrete Image Tokenizers

arXiv - AI · 3 min read

Summary

This paper investigates the adversarial robustness of discrete image tokenizers, highlighting their vulnerabilities and proposing a novel unsupervised adversarial training method to enhance their resilience across various tasks.

Why It Matters

As discrete image tokenizers become integral to multimodal systems, understanding their susceptibility to adversarial attacks is crucial for developing safe and effective AI models. This research provides foundational insights into improving the robustness of these systems, which is vital for real-world applications.

Key Takeaways

  • Discrete image tokenizers are vulnerable to adversarial attacks that change the tokens they extract, disrupting downstream tasks.
  • The study formulates computationally efficient, application-agnostic attacks that perturb tokenizer features; a minimal sketch follows this list.
  • Unsupervised adversarial fine-tuning of the tokenizer significantly improves robustness to both unsupervised and end-to-end supervised attacks.
  • The method needs only unlabeled images, yet generalizes well to unseen tasks and data.
  • The work underscores the importance of tokenizer robustness in the development of multimodal foundation models.
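
To make the attack surface concrete, here is a minimal sketch of the kind of feature-space attack the abstract describes: a PGD-style perturbation that pushes the tokenizer encoder's features away from their clean values, which tends to flip the quantized tokens. The `tokenizer_encoder` interface, the L2 objective, and all budgets (`eps`, `alpha`, `steps`) are illustrative assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a feature-space PGD attack on a discrete tokenizer.
# Assumption: `tokenizer_encoder` is a differentiable module returning
# pre-quantization features; pixel values lie in [0, 1].
import torch

def feature_pgd_attack(tokenizer_encoder, images, eps=8/255, alpha=2/255, steps=10):
    """Maximize the L2 distance between clean and adversarial encoder
    features; large feature shifts tend to change the extracted tokens."""
    clean_feats = tokenizer_encoder(images).detach()
    delta = torch.empty_like(images).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        adv_feats = tokenizer_encoder((images + delta).clamp(0, 1))
        loss = (adv_feats - clean_feats).pow(2).mean()  # push features apart
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()          # L-inf PGD step
            delta.clamp_(-eps, eps)                     # project to eps-ball
            delta.grad.zero_()
    return (images + delta).detach().clamp(0, 1)
```

Because the objective needs no labels, the same attack applies unchanged across classification, retrieval, and captioning pipelines, which matches the paper's claim of application-agnostic attacks.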

Computer Science > Computer Vision and Pattern Recognition
arXiv:2602.18252 (cs) [Submitted on 20 Feb 2026]

Title: On the Adversarial Robustness of Discrete Image Tokenizers
Authors: Rishika Bhagwatkar, Irina Rish, Nicolas Flammarion, Francesco Croce

Abstract: Discrete image tokenizers encode visual inputs as sequences of tokens from a finite vocabulary and are gaining popularity in multimodal systems, including encoder-only, encoder-decoder, and decoder-only models. However, unlike CLIP encoders, their vulnerability to adversarial attacks has not been explored. As the first work studying this topic, we first formulate attacks that aim to perturb the features extracted by discrete tokenizers, and thus change the extracted tokens. These attacks are computationally efficient, application-agnostic, and effective across classification, multimodal retrieval, and captioning tasks. Second, to defend against this vulnerability, inspired by recent work on robust CLIP encoders, we fine-tune popular tokenizers with unsupervised adversarial training, keeping all other components frozen. While unsupervised and task-agnostic, our approach significantly improves robustness to both unsupervised and end-to-end supervised attacks and generalizes well to unseen tasks and data. Unlike supervised adversarial training, our approac...
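
The defense side, unsupervised adversarial fine-tuning of the tokenizer while everything downstream stays frozen, can likewise be sketched. Below is a hedged training step in the spirit of the unsupervised fine-tuning of robust CLIP encoders that the abstract cites as inspiration: a frozen copy of the original encoder supplies clean target features, and the trainable encoder is pulled to reproduce them on adversarial inputs. The loss, optimizer, and interfaces are illustrative assumptions; the paper's actual recipe may differ.

```python
# Hedged sketch of one unsupervised adversarial fine-tuning step.
# Reuses `feature_pgd_attack` from the sketch above; all names are
# hypothetical and not taken from the paper's code.
import copy
import torch

def adversarial_finetune_step(encoder, frozen_encoder, images, optimizer):
    """Craft adversarial images against the current encoder, then pull
    their features back toward the frozen (original) clean features."""
    with torch.no_grad():
        target_feats = frozen_encoder(images)            # clean anchors
    adv_images = feature_pgd_attack(encoder, images)     # inner attack
    loss = (encoder(adv_images) - target_feats).pow(2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage (hypothetical setup): keep a frozen reference copy of the encoder.
# frozen_encoder = copy.deepcopy(encoder).eval()
# for p in frozen_encoder.parameters():
#     p.requires_grad_(False)
```

Since only unlabeled images enter this loop, the procedure stays task-agnostic, which is consistent with the abstract's claim that the fine-tuned tokenizer generalizes to unseen tasks and data.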
