[2510.12117] Locket: Robust Feature-Locking Technique for Language Models
Computer Science > Cryptography and Security
arXiv:2510.12117 (cs)
[Submitted on 14 Oct 2025 (v1), last revised 26 Mar 2026 (this version, v2)]

Title: Locket: Robust Feature-Locking Technique for Language Models
Authors: Lipeng He, Vasisht Duddu, N. Asokan

Abstract: Chatbot service providers (e.g., OpenAI) rely on tiered subscription plans to generate revenue, offering black-box access to basic models for free users and advanced models to paying subscribers. However, this approach is unprofitable for providers and inflexible for users. A pay-to-unlock scheme for premium features (e.g., math, coding) offers a more sustainable alternative. Enabling such a scheme requires a feature-locking technique (FLoTE) that is (i) effective in refusing locked features, (ii) utility-preserving for unlocked features, (iii) robust against evasion or unauthorized credential sharing, and (iv) scalable to multiple features and clients. Existing FLoTEs (e.g., password-locked models) fail to meet these criteria. To fill this gap, we present Locket, the first robust and scalable FLoTE to enable pay-to-unlock schemes. We develop a framework for adversarial training and merging of feature-locking adapters, which enables Locket to selectively enable or disable specific features of a model. Evaluation shows that Locket is effective ($100$% refusal rate), utilit...
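The abstract describes selectively enabling or disabling features by merging per-feature adapters into a base model. As a rough illustration of that idea only, the sketch below composes LoRA-style additive weight deltas for whichever features a client has unlocked. All names (`FeatureAdapter`, `merge_adapters`) and the direction of the merge are assumptions for illustration, not details from the paper.

```python
# Hypothetical sketch: per-feature additive adapters merged into base
# weights, in the spirit of LoRA-style adapter composition. Whether
# Locket merges adapters for unlocked or locked features is an
# assumption here; the paper's actual mechanism may differ.

class FeatureAdapter:
    """Weight delta associated with one gated feature (e.g., 'math')."""
    def __init__(self, name, delta):
        self.name = name
        self.delta = delta  # same shape as the base weight matrix

def merge_adapters(base, adapters, unlocked):
    """Return base weights plus the delta of every unlocked feature.

    Adapters for features the client has not paid to unlock are left
    out of the merge, so the served model lacks those capabilities.
    """
    rows, cols = len(base), len(base[0])
    merged = [row[:] for row in base]  # deep-copy the base weights
    for ad in adapters:
        if ad.name in unlocked:
            for i in range(rows):
                for j in range(cols):
                    merged[i][j] += ad.delta[i][j]
    return merged

# Toy 2x2 base weights and two feature adapters.
base = [[1.0, 0.0], [0.0, 1.0]]
math_ad = FeatureAdapter("math", [[0.1, 0.0], [0.0, 0.1]])
code_ad = FeatureAdapter("coding", [[0.0, 0.2], [0.2, 0.0]])

# A client who unlocked only "math" gets only the math delta applied.
served = merge_adapters(base, [math_ad, code_ad], unlocked={"math"})
```

Per-client serving then reduces to selecting which adapter deltas participate in the merge, which is one plausible reading of how a single base model could scale to many features and clients.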