[2603.26680] AlpsBench: An LLM Personalization Benchmark for Real-Dialogue Memorization and Preference Alignment
Computer Science > Computation and Language
arXiv:2603.26680 (cs)
[Submitted on 9 Mar 2026]

Title: AlpsBench: An LLM Personalization Benchmark for Real-Dialogue Memorization and Preference Alignment

Authors: Jianfei Xiao, Xiang Yu, Chengbing Wang, Wuqiang Zheng, Xinyu Lin, Kaining Liu, Hongxun Ding, Yang Zhang, Wenjie Wang, Fuli Feng, Xiangnan He

Abstract: As Large Language Models (LLMs) evolve into lifelong AI assistants, LLM personalization has become a critical frontier. However, progress is currently bottlenecked by the absence of a gold-standard evaluation benchmark. Existing benchmarks either overlook personalized information management that is critical for personalization or rely heavily on synthetic dialogues, which exhibit an inherent distribution gap from real-world dialogue. To bridge this gap, we introduce AlpsBench, An LLM PerSonalization benchmark derived from real-world human-LLM dialogues. AlpsBench comprises 2,500 long-term interaction sequences curated from WildChat, paired with human-verified structured memories that encapsulate both explicit and implicit personalization signals. We define four pivotal tasks - personalized information extraction, updating, retrieval, and utilization - and establish protocols to evaluate the entire lifecycle of memory management. ...