[2510.10102] PANTHER: Generative Pretraining Beyond Language for Sequential User Behavior Modeling
Computer Science > Machine Learning
arXiv:2510.10102 (cs)
[Submitted on 11 Oct 2025 (v1), last revised 30 Mar 2026 (this version, v2)]

Title: PANTHER: Generative Pretraining Beyond Language for Sequential User Behavior Modeling
Authors: Guilin Li, Yun Zhang, Xiuyuan Chen, Chengqi Li, Bo Wang, Linghe Kong, Wenjia Wang, Weiran Huang, Matthias Hwai Yong Tan

Abstract: Large language models (LLMs) have shown that generative pretraining can distill vast world knowledge into compact token representations. While LLMs encapsulate extensive world knowledge, they remain limited in modeling the behavioral knowledge contained within user interaction histories. User behavior forms a distinct modality, where each action, defined by multi-dimensional attributes such as time, context, and transaction type, constitutes a behavioral token. Modeling these high-cardinality sequences is challenging, and discriminative models often falter under limited supervision. To bridge this gap, we extend generative pretraining to user behavior, learning transferable representations from unlabeled behavioral data analogous to how LLMs learn from text. We present PANTHER, a hybrid generative-discriminative framework that unifies user behavior pretraining and downstream adaptation, enabling large-scale sequential user representation l...
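The abstract's analogy between text tokens and multi-attribute behavioral tokens suggests a concrete pretraining recipe: embed each attribute of an action, combine the attribute embeddings into one behavioral token, and train a causal Transformer to predict the next action from the prefix. The sketch below illustrates that recipe only; it is a minimal, hypothetical implementation in PyTorch, and all module names, attribute vocabularies, and hyperparameters (`BehavioralTokenEmbedding`, `BehaviorGPT`, the vocabulary sizes, etc.) are assumptions, not PANTHER's actual architecture.

```python
# Hypothetical sketch of generative pretraining on behavioral tokens,
# in the spirit of the abstract. Not the paper's actual model.
import torch
import torch.nn as nn


class BehavioralTokenEmbedding(nn.Module):
    """Embed a multi-attribute action (e.g. time bucket, context,
    transaction type) into one behavioral-token vector by summing
    per-attribute embeddings."""

    def __init__(self, attr_vocab_sizes, d_model):
        super().__init__()
        self.embeds = nn.ModuleList(
            nn.Embedding(v, d_model) for v in attr_vocab_sizes
        )

    def forward(self, attrs):  # attrs: (batch, seq_len, num_attrs) int64
        return sum(emb(attrs[..., i]) for i, emb in enumerate(self.embeds))


class BehaviorGPT(nn.Module):
    """Causal Transformer trained to predict the next action's attributes,
    analogous to next-token prediction in LLMs."""

    def __init__(self, attr_vocab_sizes, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.tok = BehavioralTokenEmbedding(attr_vocab_sizes, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        # One prediction head per attribute of the next behavioral token.
        self.heads = nn.ModuleList(
            nn.Linear(d_model, v) for v in attr_vocab_sizes
        )

    def forward(self, attrs):
        x = self.tok(attrs)
        # Causal mask so position t only attends to positions <= t.
        mask = nn.Transformer.generate_square_subsequent_mask(x.size(1))
        h = self.encoder(x, mask=mask)
        return [head(h) for head in self.heads]  # per-attribute logits


# Pretraining step on synthetic data: sum of cross-entropy losses over
# attributes, with targets shifted by one position (predict action t+1
# from the prefix up to t). Vocabulary sizes are illustrative.
vocab = [48, 32, 16]  # assumed: time bucket, context, transaction type
model = BehaviorGPT(vocab)
seq = torch.stack(
    [torch.randint(0, v, (8, 20)) for v in vocab], dim=-1
)  # (batch=8, seq_len=20, num_attrs=3) of synthetic behavioral tokens
logits = model(seq[:, :-1])
loss = sum(
    nn.functional.cross_entropy(lg.reshape(-1, v), seq[:, 1:, i].reshape(-1))
    for i, (lg, v) in enumerate(zip(logits, vocab))
)
loss.backward()
```

Summing per-attribute embeddings is one simple way to form a behavioral token from high-cardinality attributes; the final hidden state of the pretrained encoder can then serve as a transferable user representation for downstream discriminative heads, which is the hybrid generative-discriminative pattern the abstract describes.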