[2603.03410] On Google's SynthID-Text LLM Watermarking System: Theoretical Analysis and Empirical Validation
Computer Science > Cryptography and Security
arXiv:2603.03410 (cs)
[Submitted on 3 Mar 2026]
Title: On Google's SynthID-Text LLM Watermarking System: Theoretical Analysis and Empirical Validation
Authors: Romina Omidi, Yun Dong, Binghui Wang
Abstract: Google's SynthID-Text, the first production-ready generative watermarking system for large language models, uses a novel Tournament-based method that achieves state-of-the-art detectability for identifying AI-generated text. The system's innovations are: 1) a new Tournament sampling algorithm for watermark embedding, 2) a detection strategy based on an introduced score function (e.g., the Bayesian or mean score), and 3) a unified design that supports both distortionary and non-distortionary watermarking methods. This paper presents the first theoretical analysis of SynthID-Text, focusing on its detection performance and watermark robustness, complemented by empirical validation. For example, we prove that the mean score is inherently vulnerable to an increased number of tournament layers, and we design a layer inflation attack to break SynthID-Text. We also prove that the Bayesian score offers improved watermark robustness w.r.t. the number of layers and further establish that the optimal Bernoulli distribution for watermark detection is achieved when the pa...
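The abstract names Tournament sampling and the mean score without spelling them out. The following is a minimal sketch of how the two are commonly described for SynthID-Text, not the authors' implementation. Everything in it is an assumption made for illustration: the keyed-hash `g_value` construction, the four-token context window, the Bernoulli(0.5) g-values, and all function names are hypothetical.

```python
import hashlib
import random


def g_value(token, context, layer, key):
    """Pseudorandom bit in {0, 1} from a keyed hash (hypothetical construction)."""
    payload = f"{key}|{layer}|{context}|{token}".encode()
    return hashlib.sha256(payload).digest()[0] & 1


def context_of(tokens, i, window=4):
    """Context for position i: the preceding `window` tokens (assumed window size)."""
    return tuple(tokens[max(0, i - window):i])


def tournament_sample(probs, context, num_layers, key, rng):
    """Draw 2**num_layers candidates from `probs`, then run a knockout tournament.

    In each layer, candidates are paired and the one with the higher g-value
    for that layer advances; ties are broken uniformly at random.
    """
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    candidates = rng.choices(tokens, weights=weights, k=2 ** num_layers)
    for layer in range(num_layers):
        winners = []
        for a, b in zip(candidates[0::2], candidates[1::2]):
            ga = g_value(a, context, layer, key)
            gb = g_value(b, context, layer, key)
            if ga == gb:
                winners.append(rng.choice([a, b]))
            else:
                winners.append(a if ga > gb else b)
        candidates = winners
    return candidates[0]


def generate(probs, length, num_layers, key, seed=0):
    """Generate `length` tokens from a fixed toy distribution with watermarking."""
    rng = random.Random(seed)
    out = []
    for i in range(length):
        ctx = context_of(out, i)
        out.append(tournament_sample(probs, ctx, num_layers, key, rng))
    return out


def mean_score(tokens, num_layers, key):
    """Average g-value over all tokens and all tournament layers."""
    total = sum(
        g_value(tok, context_of(tokens, i), layer, key)
        for i, tok in enumerate(tokens)
        for layer in range(num_layers)
    )
    return total / (len(tokens) * num_layers)


if __name__ == "__main__":
    # Toy uniform "model" over 50 tokens; a real LLM distribution changes per step.
    toy_probs = {f"tok{i}": 1 / 50 for i in range(50)}
    text = generate(toy_probs, length=200, num_layers=3, key="secret", seed=1)
    print("score with correct key:", mean_score(text, 3, "secret"))
    print("score with wrong key:  ", mean_score(text, 3, "other-key"))
```

Under these assumptions, text scored with the embedding key should have a mean score noticeably above 0.5, while the same text scored with an unrelated key should stay close to 0.5. Because the mean score averages over every layer equally, the sketch also shows where the number of layers enters the detector.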