[2603.28972] Privacy Guard & Token Parsimony by Prompt and Context Handling and LLM Routing
Computer Science > Cryptography and Security
arXiv:2603.28972 (cs)
[Submitted on 30 Mar 2026]

Title: Privacy Guard & Token Parsimony by Prompt and Context Handling and LLM Routing
Authors: Alessio Langiu

Abstract: The large-scale adoption of Large Language Models (LLMs) forces a trade-off between operational cost (OpEx) and data privacy. Current routing frameworks reduce costs but ignore prompt sensitivity, exposing users and institutions to leakage risks to third-party cloud providers. We formalise the "Inseparability Paradigm": advanced context management intrinsically coincides with privacy management. We propose a local "Privacy Guard" -- a holistic contextual observer powered by an on-premise Small Language Model (SLM) -- that performs abstractive summarisation and Automatic Prompt Optimisation (APO) to decompose prompts into focused sub-tasks, re-routing high-risk queries to Zero-Trust or NDA-covered models. This dual mechanism simultaneously eliminates sensitive inference vectors (Zero Leakage) and reduces cloud token payloads (OpEx Reduction). A LIFO-based context compacting mechanism further bounds working memory, limiting the emergent leakage surface. We validate the framework through a 2x2 benchmark (Lazy vs. Expert users; Personal vs. Institutional secrets) on a 1,000-sample dataset, achieving a 45% blen...
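The routing and context-bounding mechanisms described in the abstract can be sketched in miniature. The following is a hypothetical illustration only, not the paper's implementation: the `SENSITIVE_MARKERS` keyword check stands in for the on-premise SLM observer, and `PrivacyGuardRouter`, `is_sensitive`, and the endpoint names `"local-slm"`/`"cloud-llm"` are invented for the sketch. The bounded `deque` models the LIFO-based context compacting that limits working memory.

```python
from collections import deque

# Illustrative keyword set; the paper's Privacy Guard uses an on-premise
# SLM as a holistic contextual observer, not keyword matching.
SENSITIVE_MARKERS = {"password", "ssn", "patient", "salary", "nda"}

def is_sensitive(prompt: str) -> bool:
    """Naive stand-in for SLM-based sensitivity classification."""
    return any(marker in prompt.lower() for marker in SENSITIVE_MARKERS)

class PrivacyGuardRouter:
    """Hypothetical sketch of the routing + context-compacting pattern."""

    def __init__(self, max_context_items: int = 4):
        # LIFO-style bounded context: only the most recent items are kept,
        # which limits the emergent leakage surface of accumulated state.
        self.context = deque(maxlen=max_context_items)

    def route(self, prompt: str) -> str:
        self.context.append(prompt)
        # High-risk queries stay on a Zero-Trust / NDA-covered model;
        # benign ones go to a cheaper cloud endpoint (OpEx reduction).
        return "local-slm" if is_sensitive(prompt) else "cloud-llm"
```

In this toy version, a prompt mentioning an NDA is kept local while a generic question is routed to the cloud, and the context deque never grows beyond its bound regardless of how many prompts are processed.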