[2602.12394] Synthetic Interaction Data for Scalable Personalization in Large Language Models

arXiv - Machine Learning · 4 min read · Article

Summary

The paper introduces PersonaGym, a framework for generating synthetic interaction data to enhance personalization in large language models (LLMs). It addresses the limitations of existing prompt optimization methods by modeling dynamic user preferences and providing a scalable...

Why It Matters

Existing prompt optimization methods focus on task-level optimization and largely overlook the preferences and latent constraints of individual users. The paper attributes this gap to two obstacles: the absence of high-quality, privacy-sensitive data that capture personalized user-LLM interactions at scale, and the lack of robust reward signals for individual preferences. Generating the interaction data synthetically is its answer to the first obstacle.

Key Takeaways

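  • PersonaGym generates high-fidelity synthetic multi-turn interaction trajectories for personalized user-LLM interactions.
  • The framework models preferences as a dynamic process, rather than as static persona-preference pairs, and simulates realistic preference behaviors and semantic-aware noise.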
  • Personalized Prompt Optimization (PPOpt) improves prompt effectiveness without altering LLMs.
  • Extensive experiments show significant improvements in personalization quality and robustness.
  • The research targets two gaps in existing personalization methods for LLMs: scarce personalized interaction data and the lack of robust reward signals for individual preferences.

arXiv:2602.12394 (cs), Computer Science > Machine Learning

Submitted on 12 Feb 2026

Title: Synthetic Interaction Data for Scalable Personalization in Large Language Models

Authors: Yuchen Ma, Yue Huang, Wenjie Wang, Xiaonan Luo, Xiangliang Zhang, Stefan Feuerriegel

Abstract

Personalized prompting offers large opportunities for deploying large language models (LLMs) to diverse users, yet existing prompt optimization methods primarily focus on task-level optimization while largely overlooking user-specific preferences and latent constraints of individual users. This gap is primarily due to (i) the absence of high-quality, privacy-sensitive data that capture personalized user-LLM interactions at scale, and (ii) the lack of robust reward signals for individual preferences. To overcome existing data limitations, we introduce a high-fidelity synthetic data generation framework called PersonaGym. Unlike prior work that treats personalization as static persona-preference pairs, PersonaGym models a dynamic preference process via an agentic LLM system to simulate realistic preference behaviors and semantic-aware noise in order to generate personalized multi-turn interaction trajectories. Using PersonaGym, we release PersonaAtlas, a large-scale, high-quality, and diverse synthetic dataset of high-fidelity multi-t...
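The contrast the abstract draws between static persona-preference pairs and a dynamic preference process can be illustrated with a toy sketch. This is not PersonaGym's implementation: the paper describes an agentic LLM system, whereas the code below swaps in a seeded random process purely for illustration. The names `SimulatedUser`, `simulate_trajectory`, `drift_prob`, and `noise_prob` are hypothetical, and the noise here is uniform random rather than semantic-aware.

```python
import random
from dataclasses import dataclass


@dataclass
class SimulatedUser:
    """Hypothetical simulated user; not taken from the paper."""
    persona: str
    preferences: dict      # current "true" preferences, e.g. {"tone": "formal"}
    drift_prob: float = 0.3  # assumed chance that one preference changes per turn
    noise_prob: float = 0.2  # assumed chance that one stated preference is off

    def step(self, rng, options):
        """Advance one turn and return the preferences the user states."""
        # Dynamic part: the true preference may drift between turns.
        if rng.random() < self.drift_prob:
            key = rng.choice(sorted(self.preferences))
            self.preferences[key] = rng.choice(options[key])
        stated = dict(self.preferences)
        # Noise part: what the user says may differ from what they want.
        if rng.random() < self.noise_prob:
            key = rng.choice(sorted(stated))
            stated[key] = rng.choice(options[key])
        return stated


def simulate_trajectory(user, options, num_turns, seed=0):
    """Generate a multi-turn record of true versus stated preferences."""
    rng = random.Random(seed)  # seeded so a trajectory is reproducible
    trajectory = []
    for turn in range(num_turns):
        stated = user.step(rng, options)
        trajectory.append(
            {"turn": turn, "true": dict(user.preferences), "stated": stated}
        )
    return trajectory


if __name__ == "__main__":
    options = {"tone": ["formal", "casual"], "length": ["short", "detailed"]}
    user = SimulatedUser("graduate student", {"tone": "formal", "length": "short"})
    for record in simulate_trajectory(user, options, num_turns=5, seed=1):
        print(record)
```

A static persona-preference pair corresponds to setting both `drift_prob` and `noise_prob` to zero, in which case every turn repeats the initial preferences.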
