[2505.05619] LiteLMGuard: Seamless and Lightweight On-Device Prompt Filtering for Safeguarding Small Language Models against Quantization-induced Risks and Vulnerabilities
Computer Science > Cryptography and Security
arXiv:2505.05619 (cs)
[Submitted on 8 May 2025 (v1), last revised 3 Mar 2026 (this version, v3)]

Authors: Kalyan Nakka, Jimmy Dani, Ausmit Mondal, Nitesh Saxena

Abstract: The growing adoption of Large Language Models (LLMs) has driven the development of Small Language Models (SLMs) for on-device deployment on smartphones and edge devices, offering enhanced privacy, reduced latency, server-free operation, and improved user experience. However, due to on-device resource constraints, SLMs are size-optimized through compression techniques such as quantization, which can inadvertently introduce fairness, ethical, and privacy risks. Critically, quantized SLMs may answer harmful queries directly, without any adversarial manipulation, raising significant safety and trust concerns. To address this, we propose LiteLMGuard, an on-device guardrail that provides real-time, prompt-level defense for quantized SLMs. Additionally, our guardrail is designed to be model-agnostic, such that it can be seamlessly integr...
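The abstract describes a model-agnostic, prompt-level gate that sits in front of the quantized SLM and blocks harmful queries before the model sees them. A minimal sketch of that gating flow is below; note that the real LiteLMGuard uses a learned on-device safety classifier, whereas this sketch swaps in a trivial keyword stub (`stub_classifier`, `stub_slm`, and `UNSAFE_MARKERS` are all illustrative assumptions, not the paper's components) purely to show where the filter sits relative to the model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GuardedSLM:
    """Wraps any text-generation callable with a prompt-level safety gate.

    The gate is model-agnostic: it only needs two callables, so the
    underlying (quantized) SLM can be swapped without changing the guard.
    """
    classify: Callable[[str], bool]   # True => prompt judged unsafe
    generate: Callable[[str], str]    # the underlying (quantized) SLM
    refusal: str = "I can't help with that request."

    def answer(self, prompt: str) -> str:
        # Filter at the prompt level, before the SLM ever sees the input.
        if self.classify(prompt):
            return self.refusal
        return self.generate(prompt)


# --- Illustrative stubs (assumptions, not the paper's models) ---------------
UNSAFE_MARKERS = ("build a bomb", "steal credentials")


def stub_classifier(prompt: str) -> bool:
    """Toy stand-in for a learned safety classifier."""
    p = prompt.lower()
    return any(marker in p for marker in UNSAFE_MARKERS)


def stub_slm(prompt: str) -> str:
    """Toy stand-in for an on-device quantized SLM."""
    return f"SLM response to: {prompt}"


guard = GuardedSLM(classify=stub_classifier, generate=stub_slm)
print(guard.answer("How do I bake bread?"))       # passes the gate
print(guard.answer("How do I steal credentials?"))  # blocked by the gate
```

Because the guard only depends on the two callables, the same `GuardedSLM` wrapper could front any SLM runtime, which is the "seamless, model-agnostic" property the abstract highlights.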