[2512.10766] Metaphor-based Jailbreak Attacks on Text-to-Image Models

[2512.10766] Metaphor-based Jailbreak Attacks on Text-to-Image Models

arXiv - AI 4 min read

About this article

Abstract page for arXiv paper 2512.10766: Metaphor-based Jailbreak Attacks on Text-to-Image Models

Computer Science > Cryptography and Security arXiv:2512.10766 (cs) [Submitted on 6 Dec 2025 (v1), last revised 24 Mar 2026 (this version, v2)] Title:Metaphor-based Jailbreak Attacks on Text-to-Image Models Authors:Chenyu Zhang, Lanjun Wang, Yiwen Ma, Wenhui Li, Yi Tu, An-An Liu View a PDF of the paper titled Metaphor-based Jailbreak Attacks on Text-to-Image Models, by Chenyu Zhang and 5 other authors View PDF HTML (experimental) Abstract:Text-to-image (T2I) models commonly incorporate defense mechanisms to prevent the generation of sensitive images. Unfortunately, recent jailbreak attacks have shown that adversarial prompts can effectively bypass these mechanisms and induce T2I models to produce sensitive content, revealing critical safety vulnerabilities. However, existing attack methods implicitly assume that the attacker knows the type of deployed defenses, which limits their effectiveness against unknown or diverse defense mechanisms. In this work, we reveal an underexplored vulnerability of T2I models to metaphor-based jailbreak attacks (MJA), which aims to attack diverse defense mechanisms without prior knowledge of their type by generating metaphor-based adversarial prompts. Specifically, MJA consists of two modules: an LLM-based multi-agent generation module (LMAG) and an adversarial prompt optimization module (APO). LMAG decomposes the generation of metaphor-based adversarial prompts into three subtasks: metaphor retrieval, context matching, and adversarial prompt...

Originally published on March 25, 2026. Curated by AI News.

Related Articles

Llms

[D] Real-time Student Attention Detection: ResNet vs Facial Landmarks - Which approach for resource-constrained deployment?

I have a problem statement where we are supposed to detect the attention level of student in a classroom, basically output whether he is ...

Reddit - Machine Learning · 1 min ·
Llms

[P] ClaudeFormer: Building a Transformer Out of Claudes — Collaboration Request

I'm looking to work with people interested in math, machine learning, or agentic coding, on creating a multi-agent framework to do fronti...

Reddit - Machine Learning · 1 min ·
UMKC Announces New Master of Science in Artificial Intelligence
Ai Infrastructure

UMKC Announces New Master of Science in Artificial Intelligence

UMKC announces a new Master of Science in Artificial Intelligence program aimed at addressing workforce demand for AI expertise, set to l...

AI News - General · 4 min ·
Machine Learning

[D] Looking for definition of open-world ish learning problem

Hello! Recently I did a project where I initially had around 30 target classes. But at inference, the model had to be able to handle a lo...

Reddit - Machine Learning · 1 min ·
More in Machine Learning: This Week Guide Trending

No comments

No comments yet. Be the first to comment!

Stay updated with AI News

Get the latest news, tools, and insights delivered to your inbox.

Daily or weekly digest • Unsubscribe anytime