[2602.15012] Cold-Start Personalization via Training-Free Priors from Structured World Models

arXiv - AI | February 17, 2026

Summary

This paper presents Pep (Preference Elicitation with Priors), a cold-start personalization method that learns a structured world model of preference correlations offline and then elicits a new user's preferences through training-free Bayesian inference online.

Why It Matters

Cold-start personalization matters wherever an AI system must serve a user with no interaction history. This research offers a more question-efficient way to elicit preferences, which could improve user experience in applications ranging from medical to social reasoning.

Key Takeaways

Pep achieves 80.8% alignment with user preferences, outperforming reinforcement learning methods.
The framework needs 3-5x fewer interactions to elicit user preferences.
Pep adapts its questions to each user's responses instead of following a static question sequence, improving personalization accuracy.

Computer Science > Computation and Language
arXiv:2602.15012 (cs)
[Submitted on 16 Feb 2026]

Title: Cold-Start Personalization via Training-Free Priors from Structured World Models

Authors: Avinandan Bose, Shuyue Stella Li, Faeze Brahman, Pang Wei Koh, Simon Shaolei Du, Yulia Tsvetkov, Maryam Fazel, Lin Xiao, Asli Celikyilmaz

Abstract: Cold-start personalization requires inferring user preferences through interaction when no user-specific historical data is available. The core challenge is a routing problem: each task admits dozens of preference dimensions, yet individual users care about only a few, and which ones matter depends on who is asking. With a limited question budget, asking without structure will miss the dimensions that matter. Reinforcement learning is the natural formulation, but in multi-turn settings its terminal reward fails to exploit the factored, per-criterion structure of preference data, and in practice learned policies collapse to static question sequences that ignore user responses. We propose decomposing cold-start elicitation into offline structure learning and online Bayesian inference. Pep (Preference Elicitation with Priors) learns a structured world model of preference correlations offline from complete profiles, then performs training-free Bayesian inference online to...
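
The offline/online split described in the abstract can be pictured with a toy sketch. This is not Pep's implementation. The profiles, the binary criteria, the answer-noise level, and the rule of asking about the most uncertain criterion are all assumptions made for illustration. The "world model" here is simply the set of stored complete profiles used as an empirical prior.

```python
import math

# Hypothetical complete profiles collected offline. Each tuple is one user;
# each position is a binary criterion (1 = the user cares about it).
PROFILES = [
    (1, 1, 0, 1),
    (1, 0, 0, 1),
    (1, 1, 0, 1),
    (1, 0, 0, 1),
    (0, 0, 1, 0),
    (0, 0, 0, 0),
    (0, 0, 1, 0),
    (0, 0, 0, 1),
]

NOISE = 0.1  # assumed probability that a user's answer is flipped


def posterior(profiles, answers, noise=NOISE):
    """Bayes update: weight each stored profile by the answer likelihood."""
    weights = []
    for profile in profiles:
        w = 1.0  # uniform prior over stored profiles
        for dim, ans in answers.items():
            w *= (1.0 - noise) if profile[dim] == ans else noise
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]


def marginals(profiles, weights):
    """Posterior probability that the user cares about each criterion."""
    n_dims = len(profiles[0])
    return [sum(w * p[d] for p, w in zip(profiles, weights)) for d in range(n_dims)]


def entropy(q):
    """Binary entropy in bits; zero when the outcome is certain."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -(q * math.log2(q) + (1.0 - q) * math.log2(1.0 - q))


def next_question(profiles, answers):
    """Ask about the unasked criterion whose marginal is most uncertain."""
    m = marginals(profiles, posterior(profiles, answers))
    candidates = [d for d in range(len(m)) if d not in answers]
    return max(candidates, key=lambda d: entropy(m[d]))


if __name__ == "__main__":
    print(next_question(PROFILES, {}))      # first question
    print(next_question(PROFILES, {0: 1}))  # follow-up after a "yes"
    print(next_question(PROFILES, {0: 0}))  # follow-up after a "no"
```

In this toy data the follow-up question differs depending on the first answer, because the stored profiles make different criteria uncertain for different kinds of users. No training happens online; each turn only re-weights the stored profiles.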
