[2510.05181] Auditing Pay-Per-Token in Large Language Models
Computer Science > Cryptography and Security
arXiv:2510.05181 (cs)
[Submitted on 5 Oct 2025 (v1), last revised 23 Mar 2026 (this version, v2)]

Title: Auditing Pay-Per-Token in Large Language Models
Authors: Ander Artola Velasco, Stratis Tsirtsis, Manuel Gomez-Rodriguez

Abstract: Millions of users rely on a market of cloud-based services to obtain access to state-of-the-art large language models. However, it has very recently been shown that the de facto pay-per-token pricing mechanism used by providers creates a financial incentive for them to strategize and misreport the (number of) tokens a model used to generate an output. In this paper, we develop an auditing framework based on martingale theory that enables a trusted third-party auditor, who sequentially queries a provider, to detect token misreporting. Crucially, we show that our framework is guaranteed to always detect token misreporting, regardless of the provider's (mis-)reporting policy, and, with high probability, to never falsely flag a faithful provider as unfaithful. To validate our auditing framework, we conduct experiments across a wide range of (mis-)reporting policies using several large language models from the $\texttt{Llama}$, $\texttt{Gemma}$ and $\texttt{Ministral}$ families, and input prompts from a popular crowdsourced benchmarking platform. The results show t...
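To make the martingale idea concrete, here is a minimal toy sketch of a sequential audit with a Ville-style false-alarm guarantee. It is not the paper's actual construction: it assumes, purely for illustration, that the auditor can reduce each query to a binary indicator (reported token count exceeds the auditor's own recount) that is a fair coin flip under faithful reporting. The auditor then grows a nonnegative test martingale and flags the provider once its wealth crosses $1/\alpha$; Ville's inequality bounds the probability of ever falsely flagging a faithful provider by $\alpha$.

```python
import random

def audit(observe, alpha=0.01, bet=0.5, max_rounds=10_000):
    """Sequential audit via a nonnegative test martingale.

    observe() returns 1 if the provider's reported token count exceeds
    the auditor's own recount on a fresh query, else 0. Toy null
    hypothesis (an illustrative assumption, not the paper's model):
    under faithful reporting these indicators are i.i.d. fair coins,
    so wealth_t = prod_i (1 + bet * (2*x_i - 1)) has expectation 1 and
    is a martingale. By Ville's inequality, the probability that
    wealth ever reaches 1/alpha under the null is at most alpha.
    """
    wealth = 1.0
    for t in range(1, max_rounds + 1):
        x = observe()
        # Bet a fixed fraction on "over-reporting"; factor is 1.5 or 0.5.
        wealth *= 1.0 + bet * (2 * x - 1)
        if wealth >= 1.0 / alpha:
            return ("misreporting detected", t)
    return ("no detection", max_rounds)

random.seed(0)
faithful = lambda: 1 if random.random() < 0.5 else 0   # fair-coin discrepancies
inflating = lambda: 1 if random.random() < 0.8 else 0  # inflates counts 80% of rounds

print(audit(inflating))  # flags after a modest number of rounds
print(audit(faithful))   # false alarm with probability at most alpha
```

Under the inflating provider the log-wealth drifts upward by about $0.8\log 1.5 + 0.2\log 0.5 \approx 0.19$ per round, so detection typically occurs within a few dozen queries at $\alpha = 0.01$. The paper's framework replaces this hypothetical fair-coin null with guarantees that hold for arbitrary (mis-)reporting policies.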