[2603.19500] Teaching an Agent to Sketch One Part at a Time
Computer Science > Artificial Intelligence

arXiv:2603.19500 (cs)

[Submitted on 19 Mar 2026]

Title: Teaching an Agent to Sketch One Part at a Time

Authors: Xiaodan Du, Ruize Xu, David Yunis, Yael Vinker, Greg Shakhnarovich

Abstract: We develop a method for producing vector sketches one part at a time. To do this, we train a multi-modal language-model-based agent using a novel multi-turn process-reward reinforcement learning procedure that follows supervised fine-tuning. Our approach is enabled by a new dataset we call ControlSketch-Part, containing rich part-level annotations for sketches, obtained using a novel, generic automatic annotation pipeline that segments vector sketches into semantic parts and assigns paths to parts with a structured multi-stage labeling process. Our results indicate that incorporating structured part-level data and providing the agent with visual feedback throughout the process enables interpretable, controllable, and locally editable text-to-vector sketch generation.

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)

Cite as: arXiv:2603.19500 [cs.AI] (or arXiv:2603.19500v1 [cs.AI] for this version)

DOI: https://doi.org/10.48550/arXiv.2603.19500 (arXiv-issued DOI via DataCite, pending registration)
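To make the multi-turn, part-by-part setup concrete, here is a minimal sketch of how such a rollout could be structured: the agent proposes SVG paths for one semantic part per turn, sees a rendering of everything drawn so far as visual feedback, and receives a per-turn (process) reward. This is not the authors' implementation; every name here (PartSketchAgent-style `agent.propose`, `part_reward`, the part list) is a hypothetical stand-in under assumed interfaces.

```python
# Illustrative sketch only (hypothetical names, not the paper's code):
# a multi-turn, part-at-a-time vector-sketch rollout with process rewards.
from dataclasses import dataclass, field


@dataclass
class Episode:
    prompt: str                                          # text description, e.g. "a cat"
    paths: list[str] = field(default_factory=list)       # accumulated SVG path data strings
    rewards: list[float] = field(default_factory=list)   # one process reward per part/turn


def render_svg(paths: list[str], size: int = 256) -> str:
    """Assemble the paths drawn so far into an SVG document; this rendering
    is the visual feedback the agent conditions on in the next turn."""
    body = "\n".join(f'<path d="{d}" fill="none" stroke="black"/>' for d in paths)
    return (f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{size}" height="{size}">{body}</svg>')


def run_episode(agent, prompt: str, part_names: list[str], part_reward) -> Episode:
    """One multi-turn rollout: each turn the agent proposes paths for a single
    semantic part, conditioned on the prompt and a rendering of prior parts."""
    ep = Episode(prompt)
    for part in part_names:                    # e.g. ["head", "body", "legs", "tail"]
        canvas = render_svg(ep.paths)          # feedback from earlier turns
        new_paths = agent.propose(prompt, part, canvas)  # hypothetical MLLM call
        ep.paths.extend(new_paths)
        # A process reward scores this part in context (e.g. how well the updated
        # rendering matches the prompt and part), giving the RL update a dense
        # per-turn signal rather than a single end-of-sketch reward.
        ep.rewards.append(part_reward(prompt, part, render_svg(ep.paths)))
    return ep
```

The per-turn rewards collected in `Episode.rewards` are what distinguish a process-reward setup from outcome-only RL: each part proposal can be credited or penalized individually, which matches the abstract's claim of interpretable and locally editable generation.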