[2501.02406] A Training-free Method for LLM Text Attribution
Statistics > Machine Learning — arXiv:2501.02406 (stat)

[Submitted on 4 Jan 2025 (v1), last revised 22 Mar 2026 (this version, v5)]

Title: A Training-free Method for LLM Text Attribution
Authors: Tara Radvand, Mojtaba Abdolmaleki, Mohamed Mostagir, Ambuj Tewari

Abstract: Verifying the provenance of content is crucial to the functioning of many organizations, e.g., educational institutions, social media platforms, and firms. This problem is becoming increasingly challenging as text generated by Large Language Models (LLMs) becomes almost indistinguishable from human-generated content. In addition, many institutions use in-house LLMs and want to ensure that external, non-sanctioned LLMs do not produce content within their institutions. In this paper, we answer the following question: Given a piece of text, can we identify whether it was produced by a particular LLM, while ensuring a guaranteed low false positive rate? We model LLM text as a sequential stochastic process with complete dependence on history. We then design zero-shot statistical tests to (i) distinguish between text generated by two different known sets of LLMs $A$ (non-sanctioned) and $B$ (in-house), and (ii) identify whether text was generated by a known LLM or by any unknown model. We prove that the Type I and Type II errors of our test decrease exponentially with the le...
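The abstract describes a zero-shot test that distinguishes text from two known model sets without any training. A minimal sketch of the general idea is a cumulative log-likelihood ratio test over per-token log-probabilities, with a threshold chosen to cap the false positive rate. This is a standard likelihood-ratio construction, not the authors' exact statistic; the function name and inputs are illustrative assumptions, and the per-token log-probabilities are assumed to come from scoring the same text under each model.

```python
import math

def llr_attribution_test(logprobs_a, logprobs_b, alpha=0.05):
    """Illustrative likelihood-ratio test for text attribution (a sketch,
    not the paper's exact statistic).

    logprobs_a / logprobs_b: per-token log-probabilities that the same
    text receives under model A (non-sanctioned) and model B (in-house).
    Decides "B" when the cumulative log-likelihood ratio exceeds
    log(1/alpha); by Markov's inequality applied to the likelihood
    ratio, this caps the probability of deciding "B" on text actually
    written by A at alpha.
    """
    # Cumulative log-likelihood ratio: positive values favor model B.
    llr = sum(lb - la for la, lb in zip(logprobs_a, logprobs_b))
    threshold = math.log(1.0 / alpha)
    return ("B", llr) if llr >= threshold else ("A", llr)

# Synthetic example: each token is 1 nat more likely under B than A.
decision, score = llr_attribution_test([-2.0] * 10, [-1.0] * 10)
# → ("B", 10.0), since 10.0 exceeds log(1/0.05) ≈ 3.0
```

Under i.i.d.-style assumptions on the per-token ratios, both error probabilities of such a test decay exponentially in the text length, which mirrors the exponential Type I/II error guarantee stated in the abstract.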