[2601.03127] Unified Thinker: A General Reasoning Modular Core for Image Generation
Computer Science > Computer Vision and Pattern Recognition
arXiv:2601.03127 (cs)
[Submitted on 6 Jan 2026 (v1), last revised 3 Apr 2026 (this version, v2)]

Title: Unified Thinker: A General Reasoning Modular Core for Image Generation

Authors: Sashuai Zhou, Qiang Zhou, Jijin Hu, Hanqing Yang, Yue Cao, Junpeng Ma, Yinchao Ma, Jun Song, Tiezheng Ge, Cheng Yu, Bo Zheng, Zhou Zhao

Abstract: Despite impressive progress in high-fidelity image synthesis, generative models still struggle with logic-intensive instruction following, exposing a persistent reasoning-execution gap. Meanwhile, closed-source systems (e.g., Nano Banana) have demonstrated strong reasoning-driven image generation, highlighting a substantial gap to current open-source models. We argue that closing this gap requires not merely better visual generators, but executable reasoning: decomposing high-level intents into grounded, verifiable plans that directly steer the generative process. To this end, we propose Unified Thinker, a task-agnostic reasoning architecture for general image generation, designed as a unified planning core that can plug into diverse generators and workflows. Unified Thinker decouples a dedicated Thinker from the image Generator, enabling modular upgrades of reasoning without retraining the entire generative model. We further i...