[2510.13829] A Linguistics-Aware LLM Watermarking via Syntactic Predictability
Computer Science > Computation and Language

arXiv:2510.13829 (cs)

[Submitted on 10 Oct 2025 (v1), last revised 6 Apr 2026 (this version, v2)]

Title: A Linguistics-Aware LLM Watermarking via Syntactic Predictability

Authors: Shinwoo Park, Hyejin Park, Hyeseon Ahn, Yo-Sub Han

Abstract: As large language models (LLMs) continue to advance rapidly, reliable governance tools have become critical. Publicly verifiable watermarking is particularly essential for fostering a trustworthy AI ecosystem. A central challenge persists: balancing text quality against detection robustness. Recent studies have sought to navigate this trade-off by leveraging signals from model output distributions (e.g., token-level entropy); however, their reliance on these model-specific signals presents a significant barrier to public verification, as the detection process requires access to the logits of the underlying model. We introduce STELA, a novel framework that aligns watermark strength with the linguistic degrees of freedom inherent in language. STELA dynamically modulates the signal using part-of-speech (POS) n-gram-modeled linguistic indeterminacy, weakening it in grammatically constrained contexts to preserve quality and strengthening it in contexts with greater linguistic flexibility to enhance detectability. Our detector operates without a...
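The core mechanism the abstract describes, modulating watermark strength by POS n-gram-modeled linguistic indeterminacy, can be illustrated with a minimal sketch. This is not the paper's implementation: the entropy estimator, the scaling rule, and all function names (`pos_ngram_entropy`, `watermark_delta`) are hypothetical stand-ins, assuming a trigram count model over POS tags and a simple linear scaling of the green-list bias.

```python
from collections import Counter
import math

def pos_ngram_entropy(pos_tags, n=3):
    """Return a function mapping a POS context (tuple of n-1 tags) to the
    conditional entropy H(next tag | context), estimated from raw n-gram
    counts over pos_tags. Toy estimator: no smoothing, no backoff."""
    ctx_counts = Counter()
    ngram_counts = Counter()
    for i in range(len(pos_tags) - n + 1):
        ctx = tuple(pos_tags[i:i + n - 1])
        ngram_counts[(ctx, pos_tags[i + n - 1])] += 1
        ctx_counts[ctx] += 1

    def entropy(ctx):
        total = ctx_counts.get(ctx, 0)
        if total == 0:
            return 1.0  # unseen context: assume moderate flexibility
        h = 0.0
        for (c, _tag), cnt in ngram_counts.items():
            if c == ctx:
                p = cnt / total
                h -= p * math.log2(p)
        return h

    return entropy

def watermark_delta(base_delta, ctx_entropy, h_max=4.0):
    """Scale the watermark bias by syntactic flexibility: near zero in
    grammatically constrained contexts (low entropy), up to base_delta
    where the grammar leaves many choices (high entropy)."""
    return base_delta * min(ctx_entropy / h_max, 1.0)

# Toy POS stream: after (DET, NOUN) a VERB always follows (entropy 0),
# while (VERB, DET) is followed by either NOUN or ADJ (entropy 1 bit).
tags = ["DET", "NOUN", "VERB", "DET", "NOUN", "VERB", "DET", "ADJ", "NOUN"]
ent = pos_ngram_entropy(tags)
print(ent(("DET", "NOUN")))                      # 0.0: constrained context
print(ent(("VERB", "DET")))                      # 1.0: flexible context
print(watermark_delta(2.0, ent(("DET", "NOUN"))))  # bias suppressed
print(watermark_delta(2.0, ent(("VERB", "DET"))))  # partial bias applied
```

In a full system the entropy model would be trained on a large tagged corpus and the scaled delta would bias green-list token logits at generation time; because POS statistics do not depend on any particular model's logits, the same signal is available to a public detector.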