[2602.22700] IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation
Summary
The paper presents IMMACULATE, a framework for auditing large language models (LLMs) deployed as black-box services. It uses verifiable computation to detect economically motivated deviations, such as model substitution and token overbilling, without requiring trusted hardware or access to model internals.
Why It Matters
As LLMs are increasingly deployed as black-box API services, users must trust providers to run inference correctly and bill token usage honestly. IMMACULATE addresses these concerns by auditing model behavior efficiently and at low cost, enhancing trust in AI services.
Key Takeaways
- IMMACULATE audits LLMs without requiring access to their internals.
- It detects economically motivated deviations like token overbilling.
- The framework achieves strong detection guarantees with under 1% throughput overhead.
- Verifiable computation is used to selectively audit requests.
- The code for IMMACULATE is publicly available, promoting transparency.
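The summary notes that IMMACULATE audits only a small fraction of requests yet still achieves strong detection guarantees. The intuition can be sketched with a standard sampling argument: if each request is independently audited with probability p, a provider that deviates on k requests is caught with probability 1 - (1 - p)^k. This is an illustrative model only; the paper's actual sampling and verification scheme may differ.

```python
def detection_probability(audit_rate: float, num_deviations: int) -> float:
    """Probability that at least one of `num_deviations` dishonest
    responses lands in an independently sampled audit set.

    Illustrative model, not the paper's exact scheme: each request is
    audited i.i.d. with probability `audit_rate`.
    """
    return 1.0 - (1.0 - audit_rate) ** num_deviations


# Even a 1% audit rate catches a provider that deviates on 500
# requests with probability 1 - 0.99**500, roughly 99.3%.
p = detection_probability(0.01, 500)
print(f"{p:.3f}")
```

This is why selective auditing amortizes the cost of verifiable computation: the cryptographic overhead is paid on only ~1% of requests, while sustained cheating is still detected with high probability.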
Computer Science > Cryptography and Security

arXiv:2602.22700 (cs.CR), submitted on 26 Feb 2026

Title: IMMACULATE: A Practical LLM Auditing Framework via Verifiable Computation

Authors: Yanpei Guo, Wenjie Qu, Linyu Wu, Shengfang Zhai, Lionel Z. Wang, Ming Xu, Yue Liu, Binhang Yuan, Dawn Song, Jiaheng Zhang

Abstract: Commercial large language models are typically deployed as black-box API services, requiring users to trust providers to execute inference correctly and report token usage honestly. We present IMMACULATE, a practical auditing framework that detects economically motivated deviations, such as model substitution, quantization abuse, and token overbilling, without trusted hardware or access to model internals. IMMACULATE selectively audits a small fraction of requests using verifiable computation, achieving strong detection guarantees while amortizing cryptographic overhead. Experiments on dense and MoE models show that IMMACULATE reliably distinguishes benign and malicious executions with under 1% throughput overhead. Our code is published at this https URL.

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)

Cite as: arXiv:2602.22700 [cs.CR]

DOI: https://doi.org/10.48550/arXiv.2602.22700