[2602.14374] Differentially Private Retrieval-Augmented Generation
Summary
The paper presents DP-KSA, a novel algorithm that integrates differential privacy into retrieval-augmented generation (RAG) systems, addressing privacy risks while maintaining utility in large language models (LLMs).
Why It Matters
As LLMs are increasingly used in sensitive domains, ensuring privacy while preserving the quality of generated responses is critical. This research offers a solution to mitigate privacy risks associated with RAG systems, making it relevant for developers and researchers in AI and data privacy.
Key Takeaways
- DP-KSA integrates differential privacy into RAG systems using the propose-test-release paradigm.
- The algorithm addresses privacy risks without significantly degrading utility.
- Empirical results show a strong privacy-utility tradeoff in QA tasks.
Computer Science > Cryptography and Security
arXiv:2602.14374 (cs)
[Submitted on 16 Feb 2026]
Title: Differentially Private Retrieval-Augmented Generation
Authors: Tingting Tang, James Flemings, Yongqin Wang, Murali Annavaram
Abstract: Retrieval-augmented generation (RAG) is a widely used framework for reducing hallucinations in large language models (LLMs) on domain-specific tasks by retrieving relevant documents from a database to support accurate responses. However, when the database contains sensitive corpora, such as medical records or legal documents, RAG poses serious privacy risks by potentially exposing private information through its outputs. Prior work has demonstrated that one can practically craft adversarial prompts that force an LLM to regurgitate the augmented contexts. A promising direction is to integrate differential privacy (DP), a privacy notion that offers strong formal guarantees, into RAG systems. However, naively applying DP mechanisms to existing systems often leads to significant utility degradation. Particularly for RAG systems, DP can reduce the usefulness of the augmented contexts, leading to an increased risk of hallucination from the LLMs. Motivated by these challenges, we present DP-KSA, a novel privacy-preserving RAG algorithm that integrates DP using the propose-test-release paradigm. DP-KSA fol...