[2603.03303] HumanLM: Simulating Users with State Alignment Beats Response Imitation
Computer Science > Computation and Language

arXiv:2603.03303 (cs)

[Submitted on 7 Feb 2026]

Title: HumanLM: Simulating Users with State Alignment Beats Response Imitation

Authors: Shirley Wu, Evelyn Choi, Arpandeep Khatua, Zhanghan Wang, Joy He-Yueya, Tharindu Cyril Weerasooriya, Wei Wei, Diyi Yang, Jure Leskovec, James Zou

Abstract: Large Language Models (LLMs) are increasingly used to simulate how specific users respond to a given context, enabling more user-centric applications that rely on user feedback. However, existing user simulators mostly imitate surface-level patterns and language styles, which fail to reflect the underlying states of real users (e.g., beliefs and emotions). To address these limitations, we propose a novel training framework, HumanLM, which builds user simulators that accurately reflect real users. Our key insight is that, in addition to generating responses, the model should generate natural-language latent states that align with ground-truth responses through reinforcement learning. These latent states correspond to a set of psychologically grounded state dimensions that drive how real users respond. HumanLM further synthesizes these aligned latent states into responses that accurately represent real users. For extensive evaluation, we develop Humanual, a comprehensive benchmark for...
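To make the abstract's central mechanism concrete, here is a minimal, hypothetical Python sketch of a state-alignment reward: the simulator is rewarded for generating latent states that explain the real user's ground-truth response, rather than for imitating the response directly. Everything here is an assumption for illustration; the toy overlap scorer, the `SimulatorOutput` structure, and the two example state dimensions (taken from the abstract's "beliefs and emotions") stand in for the paper's actual reward design, state set, and training loop, which the abstract does not specify.

```python
import re
from dataclasses import dataclass

# Example state dimensions named in the abstract ("beliefs and emotions");
# the full psychologically grounded set used by HumanLM is an assumption here.
STATE_DIMENSIONS = ("beliefs", "emotions")


@dataclass
class SimulatorOutput:
    latent_state: dict  # natural-language text per state dimension
    response: str       # the simulated user response


def toy_alignment_score(state_text: str, ground_truth: str) -> float:
    """Toy stand-in for a learned scorer: the fraction of ground-truth
    tokens anticipated by the generated latent state. A real system would
    use an LM likelihood or a trained reward model instead."""
    state_tokens = set(re.findall(r"[a-z']+", state_text.lower()))
    truth_tokens = re.findall(r"[a-z']+", ground_truth.lower())
    if not truth_tokens:
        return 0.0
    return sum(t in state_tokens for t in truth_tokens) / len(truth_tokens)


def state_alignment_reward(output: SimulatorOutput, ground_truth: str) -> float:
    """RL reward: how well the generated latent state aligns with
    (i.e., explains) the real user's ground-truth response."""
    state_text = " ".join(
        f"{dim}: {output.latent_state.get(dim, '')}" for dim in STATE_DIMENSIONS
    )
    return toy_alignment_score(state_text, ground_truth)


# Usage: a latent state that anticipates the real response earns more reward.
out = SimulatorOutput(
    latent_state={
        "beliefs": "thinks the product is overpriced",
        "emotions": "frustrated",
    },
    response="I won't buy this.",
)
print(state_alignment_reward(out, "This is overpriced and I'm frustrated."))
```

Note the design point this illustrates: the reward targets the latent state rather than the surface response, which is what distinguishes state alignment from response imitation in the abstract's framing.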