[2604.07486] Private Seeds, Public LLMs: Realistic and Privacy-Preserving Synthetic Data Generation

arXiv:2604.07486 (cs)
[Submitted on 8 Apr 2026 (v1), last revised 11 Apr 2026 (this version, v2)]

Title: Private Seeds, Public LLMs: Realistic and Privacy-Preserving Synthetic Data Generation

Authors: Qian Ma, Sarah Rajtmajer

Abstract: Large language models (LLMs) have emerged as a powerful tool for synthetic data generation. A particularly important use case is producing synthetic replicas of private text, which requires carefully balancing privacy and utility. We propose Realistic and Privacy-Preserving Synthetic Data Generation (RPSG), which uses private seeds and integrates privacy-preserving strategies, including a formal differential privacy (DP) mechanism in the candidate selection, to generate realistic synthetic data. Comprehensive experiments against state-of-the-art private synthetic data generation methods demonstrate that RPSG achieves high fidelity to private data while providing strong privacy protection.

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI)

Cite as: arXiv:2604.07486 [cs.CR] (or arXiv:2604.07486v2 [cs.CR] for this version)

DOI: https://doi.org/10.48550/arXiv.2604.07486 (arXiv-issued DOI via DataCite)

Submission history: From: Qian Ma. [v1] Wed, 8 Apr 2026 18:26:34 ...
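The abstract says RPSG applies a formal DP mechanism during candidate selection but does not say which one. As a rough illustration of what DP selection among candidates can look like, the sketch below uses the exponential mechanism, a standard DP primitive that is swapped in here for illustration and is not confirmed to be the paper's method. The function names, the scores, and the assumption that each score changes by at most `sensitivity` when one private record changes are all hypothetical.

```python
import math
import random


def selection_probabilities(scores, epsilon, sensitivity=1.0):
    """Exponential-mechanism probabilities for a list of candidate scores.

    Candidate i is chosen with probability proportional to
    exp(epsilon * scores[i] / (2 * sensitivity)).
    ASSUMPTION: `sensitivity` bounds how much any one score can change
    when a single private record is added or removed.
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = epsilon / (2.0 * sensitivity)
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(scale * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(scores, epsilon, sensitivity=1.0, rng=None):
    """Sample the index of one candidate under the exponential mechanism."""
    rng = rng or random.Random()
    probs = selection_probabilities(scores, epsilon, sensitivity)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical scores, e.g. how many private seeds each synthetic
    # candidate resembles most closely (a count has sensitivity 1).
    candidate_scores = [3.0, 1.0, 0.0]
    print(selection_probabilities(candidate_scores, epsilon=1.0))
    print(exponential_mechanism(candidate_scores, epsilon=1.0))
```

A smaller `epsilon` flattens the probabilities toward uniform (more privacy, less fidelity to the private scores), while a larger `epsilon` concentrates them on the highest-scoring candidate, which is one concrete form of the privacy-utility balance the abstract describes.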

Originally published on April 14, 2026. Curated by AI News.
