[2602.21593] Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection
Summary
The paper introduces a novel attack, Coherence-Preserving Semantic Injection (CSI), that exploits vulnerabilities in the semantic-aware watermarking schemes used for generative images, revealing significant security weaknesses against LLM-guided manipulations.
Why It Matters
As generative images and semantic watermarking become prevalent for copyright protection, understanding vulnerabilities is crucial for improving security measures. This research highlights a significant flaw in current watermarking techniques, which could impact content authenticity and copyright enforcement in digital media.
Key Takeaways
- Introduces CSI attack that manipulates watermark signals in generative images.
- Demonstrates that current semantic watermarking methods are vulnerable to LLM-driven attacks.
- Highlights the importance of enhancing watermarking techniques to prevent misuse.
- Empirical results show CSI outperforms existing attack methods.
- Calls for a reevaluation of security protocols in semantic watermarking.
Computer Science > Machine Learning
arXiv:2602.21593 (cs) [Submitted on 25 Feb 2026]
Title: Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection
Authors: Zheng Gao, Xiaoyu Li, Zhicheng Bao, Xiaoyan Feng, Jiaojiao Jiang
Abstract: Generative images have proliferated on Web platforms in social media and online copyright distribution scenarios, and semantic watermarking has increasingly been integrated into diffusion models to support reliable provenance tracking and forgery prevention for web content. Traditional noise-layer-based watermarking, however, remains vulnerable to inversion attacks that can recover embedded signals. To mitigate this, recent content-aware semantic watermarking schemes bind watermark signals to high-level image semantics, constraining local edits that would otherwise disrupt global coherence. Yet, large language models (LLMs) possess structured reasoning capabilities that enable targeted exploration of semantic spaces, allowing locally fine-grained but globally coherent semantic alterations that invalidate such bindings. To expose this overlooked vulnerability, we introduce a Coherence-Preserving Semantic Injection (CSI) attack that leverages LLM-guided semantic manipulation under embedding-space similarity constraints. This alignment e...
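The abstract describes filtering LLM-proposed semantic edits with an embedding-space similarity constraint, so that an edit perturbs the watermark's semantic binding while staying close enough to the original to preserve global coherence. A minimal sketch of such a filter is below; the paper does not specify its encoder or threshold, so `embed` is a deterministic toy placeholder (a real attack would use a semantic encoder such as CLIP), and the threshold `tau` is an illustrative assumption.

```python
import hashlib

import numpy as np


def embed(text: str) -> np.ndarray:
    # Placeholder encoder: maps text to a deterministic unit vector via a
    # SHA-256-seeded RNG. Stands in for a real semantic embedding model.
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(64)
    return v / np.linalg.norm(v)


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Both inputs are unit vectors, so the dot product is cosine similarity.
    return float(a @ b)


def accept_edit(original: str, candidate: str, tau: float = 0.9) -> bool:
    # CSI-style filter (sketch): keep an LLM-proposed semantic edit only if
    # its embedding stays within a similarity band of the original, i.e. the
    # change is locally fine-grained but globally coherent.
    return cosine(embed(original), embed(candidate)) >= tau
```

In the attack loop described by the abstract, an LLM would repeatedly propose candidate semantic alterations and only those passing `accept_edit` would be injected, accumulating watermark-breaking changes without disrupting the image's overall semantics.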