[2603.20192] LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation
Computer Science > Computer Vision and Pattern Recognition
arXiv:2603.20192 (cs) [Submitted on 20 Mar 2026]

Title: LumosX: Relate Any Identities with Their Attributes for Personalized Video Generation
Authors: Jiazheng Xing, Fei Du, Hangjie Yuan, Pengwei Liu, Hongbin Xu, Hai Ci, Ruigang Niu, Weihua Chen, Fan Wang, Yong Liu

Abstract: Recent advances in diffusion models have significantly improved text-to-video generation, enabling personalized content creation with fine-grained control over both foreground and background elements. However, precise face-attribute alignment across subjects remains challenging, as existing methods lack explicit mechanisms to ensure intra-group consistency. Addressing this gap requires both explicit modeling strategies and face-attribute-aware data resources. We therefore propose LumosX, a framework that advances both data and model design. On the data side, a tailored collection pipeline orchestrates captions and visual cues from independent videos, while multimodal large language models (MLLMs) infer and assign subject-specific dependencies. These extracted relational priors impose a finer-grained structure that amplifies the expressive control of personalized video generation and enables the construction of a comprehensive benchmark. On the modeling side, Relation...
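The abstract describes a data pipeline in which an MLLM binds attributes to specific subject identities, yielding per-subject relational priors. The paper does not specify the data format, so the following is only a minimal illustrative sketch of what such a binding step might produce, with the `SubjectPrior` structure, the `bind_attributes` helper, and all subject/attribute names being hypothetical stand-ins rather than the authors' actual implementation:

```python
from dataclasses import dataclass, field

@dataclass
class SubjectPrior:
    """Hypothetical relational prior: one subject identity and the
    attributes (e.g. hair, clothing) bound to that identity."""
    subject_id: str
    attributes: dict = field(default_factory=dict)

def bind_attributes(triples):
    """Toy stand-in for the MLLM inference step: group
    (subject, attribute_key, attribute_value) triples into
    per-subject priors, so each attribute is tied to exactly
    one identity (the intra-group consistency the paper targets)."""
    priors = {}
    for subject, key, value in triples:
        priors.setdefault(subject, SubjectPrior(subject)).attributes[key] = value
    return priors

# Invented example triples, as an MLLM might extract from a caption.
triples = [
    ("woman_1", "hair", "red"),
    ("woman_1", "top", "blue jacket"),
    ("man_1", "hair", "short black"),
]
priors = bind_attributes(triples)
```

In this toy form, a downstream generator could condition each face on only the attributes in its own `SubjectPrior`, rather than on a flat caption where attributes can leak between subjects.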