[2603.20470] DiffGraph: An Automated Agent-driven Model Merging Framework for In-the-Wild Text-to-Image Generation
Computer Science > Artificial Intelligence
arXiv:2603.20470 (cs)
[Submitted on 20 Mar 2026]

Title: DiffGraph: An Automated Agent-driven Model Merging Framework for In-the-Wild Text-to-Image Generation

Authors: Zhuoling Li, Hossein Rahmani, Jiarui Zhang, Yu Xue, Majid Mirmehdi, Jason Kuen, Jiuxiang Gu, Jun Liu

Abstract: The rapid growth of the text-to-image (T2I) community has fostered a thriving online ecosystem of expert models, which are variants of pretrained diffusion models specialized for diverse generative abilities. Yet, existing model merging methods remain limited in fully leveraging abundant online expert resources and still struggle to meet diverse in-the-wild user needs. We present DiffGraph, a novel agent-driven graph-based model merging framework, which automatically harnesses online experts and flexibly merges them for diverse user needs. Our DiffGraph constructs a scalable graph and organizes ever-expanding online experts within it through node registration and calibration. Then, DiffGraph dynamically activates specific subgraphs based on user needs, enabling flexible combinations of different experts to achieve user-desired generation. Extensive experiments show the efficacy ...
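
The abstract describes registering experts as graph nodes, activating a subgraph according to user needs, and merging the selected experts. The toy sketch below illustrates that general idea only; it is not the paper's algorithm. Every name in it (ExpertNode, ExpertGraph, register, activate, merge), the tag-overlap selection rule, and the weighted sum of parameter offsets are assumptions made for illustration, and plain Python lists stand in for real diffusion model weights. The agent-driven and calibration steps mentioned in the abstract are not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ExpertNode:
    """Hypothetical expert entry: a name, capability tags, and parameter offsets."""
    name: str
    tags: frozenset
    delta: dict  # parameter name -> list of floats (offset from the base model)


@dataclass
class ExpertGraph:
    """Hypothetical registry of experts; edges are omitted in this toy version."""
    nodes: dict = field(default_factory=dict)

    def register(self, node):
        # Node registration: add (or overwrite) an expert under its name.
        self.nodes[node.name] = node

    def activate(self, needs):
        # Assumed selection rule: keep experts whose tags overlap the needs,
        # then normalize the overlap counts into merge weights.
        scored = {}
        for node in self.nodes.values():
            overlap = len(node.tags & needs)
            if overlap:
                scored[node.name] = overlap
        total = sum(scored.values())
        return {name: score / total for name, score in scored.items()}


def merge(base, graph, weights):
    # Assumed merge rule: base parameters plus a weighted sum of expert offsets.
    merged = {key: list(values) for key, values in base.items()}
    for name, weight in weights.items():
        for key, delta in graph.nodes[name].delta.items():
            merged[key] = [m + weight * d for m, d in zip(merged[key], delta)]
    return merged


graph = ExpertGraph()
graph.register(ExpertNode("anime", frozenset({"anime"}), {"w": [1.0, 0.0]}))
graph.register(ExpertNode("portrait", frozenset({"portrait", "photo"}), {"w": [0.0, 2.0]}))
graph.register(ExpertNode("landscape", frozenset({"landscape"}), {"w": [5.0, 5.0]}))

weights = graph.activate(frozenset({"anime", "portrait"}))
merged = merge({"w": [1.0, 1.0]}, graph, weights)
print(weights, merged)
```

In this toy setup a request tagged "anime" and "portrait" activates two of the three registered experts with equal weight, and the "landscape" expert is left out of the merge. A real system would operate on model checkpoints or adapters rather than lists, and would need a far richer way to match experts to user needs.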