[2602.21593] Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection
Summary
The paper introduces a novel attack, Coherence-Preserving Semantic Injection (CSI), that exploits vulnerabilities in the semantic-aware watermarking schemes used for generative images, revealing significant security weaknesses against LLM-guided manipulations.
Why It Matters
As generative images and semantic watermarking become prevalent for copyright protection, understanding vulnerabilities is crucial for improving security measures. This research highlights a significant flaw in current watermarking techniques, which could impact content authenticity and copyright enforcement in digital media.
Key Takeaways
- Introduces CSI attack that manipulates watermark signals in generative images.
- Demonstrates that current semantic watermarking methods are vulnerable to LLM-driven attacks.
- Highlights the importance of enhancing watermarking techniques to prevent misuse.
- Empirical results show CSI outperforms existing attack methods.
- Calls for a reevaluation of security protocols in semantic watermarking.
Computer Science > Machine Learning
arXiv:2602.21593 (cs) [Submitted on 25 Feb 2026]
Title: Breaking Semantic-Aware Watermarks via LLM-Guided Coherence-Preserving Semantic Injection
Authors: Zheng Gao, Xiaoyu Li, Zhicheng Bao, Xiaoyan Feng, Jiaojiao Jiang
Abstract: Generative images have proliferated on Web platforms in social media and online copyright distribution scenarios, and semantic watermarking has increasingly been integrated into diffusion models to support reliable provenance tracking and forgery prevention for web content. Traditional noise-layer-based watermarking, however, remains vulnerable to inversion attacks that can recover embedded signals. To mitigate this, recent content-aware semantic watermarking schemes bind watermark signals to high-level image semantics, constraining local edits that would otherwise disrupt global coherence. Yet, large language models (LLMs) possess structured reasoning capabilities that enable targeted exploration of semantic spaces, allowing locally fine-grained but globally coherent semantic alterations that invalidate such bindings. To expose this overlooked vulnerability, we introduce a Coherence-Preserving Semantic Injection (CSI) attack that leverages LLM-guided semantic manipulation under embedding-space similarity constraints. This alignment e...
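The abstract describes filtering LLM-proposed semantic edits with an embedding-space similarity constraint, so that an edit perturbs the watermark's semantic binding while staying close enough to the original to preserve global coherence. A minimal sketch of such a filter is below; the paper does not specify its encoder or threshold, so `embed` is a deterministic toy placeholder (a real attack would use a semantic encoder such as CLIP), and the threshold `tau` is an illustrative assumption.

```python
import hashlib

import numpy as np


def embed(text: str) -> np.ndarray:
    # Placeholder encoder: maps text to a deterministic unit vector via a
    # SHA-256-seeded RNG. Stands in for a real semantic embedding model.
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(64)
    return v / np.linalg.norm(v)


def cosine(a: np.ndarray, b: np.ndarray) -> float:
    # Both inputs are unit vectors, so the dot product is cosine similarity.
    return float(a @ b)


def accept_edit(original: str, candidate: str, tau: float = 0.9) -> bool:
    # CSI-style filter (sketch): keep an LLM-proposed semantic edit only if
    # its embedding stays within a similarity band of the original, i.e. the
    # change is locally fine-grained but globally coherent.
    return cosine(embed(original), embed(candidate)) >= tau
```

In the attack loop described by the abstract, an LLM would repeatedly propose candidate semantic alterations and only those passing `accept_edit` would be injected, accumulating watermark-breaking changes without disrupting the image's overall semantics.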