[2604.05072] Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling
Computer Science > Machine Learning
arXiv:2604.05072 (cs)
[Submitted on 6 Apr 2026]
Title: Hierarchical SVG Tokenization: Learning Compact Visual Programs for Scalable Vector Graphics Modeling
Authors: Ximing Xing, Ziteng Xue, Zhenxi Li, Weicong Liang, Linqing Wang, Zhantao Yang, Tiankai Hang, Zijin Yin, Qinglin Lu, Chunyu Wang, Qian Yu
Abstract: Recent large language models have shifted SVG generation from differentiable rendering optimization to autoregressive program synthesis. However, existing approaches still rely on generic byte-level tokenization inherited from natural language processing, which poorly reflects the geometric structure of vector graphics. Numerical coordinates are fragmented into discrete symbols, destroying spatial relationships and introducing severe token redundancy, often leading to coordinate hallucination and inefficient long-sequence generation. To address these challenges, we propose HiVG, a hierarchical SVG tokenization framework tailored for autoregressive vector graphics generation. HiVG decomposes raw SVG strings into structured atomic tokens and further compresses executable command-parameter groups into geometry-constrained segment tokens, substantially improving sequence efficiency while preserving syntactic validity. To fu...
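
To make the fragmentation issue described in the abstract concrete, here is a minimal Python sketch. It is not HiVG's tokenizer: the path string, the regular expression, and the function names `char_tokens`, `atomic_tokens`, and `segment_tokens` are all hypothetical, and a character-level split is used only as a rough stand-in for generic byte-level tokenization. It compares sequence lengths when a path is split per character, per command or number, and per command-parameter group.

```python
import re

# Hypothetical example path data; not taken from the paper.
PATH_D = "M 10.5 20.25 L 30.75 40.5 C 50 60 70 80 90.125 100 Z"

# A path command letter or a (possibly signed, decimal) number.
# Illustrative only; it does not cover every corner of the SVG path grammar.
_TOKEN_RE = re.compile(r"[MmLlHhVvCcSsQqTtAaZz]|-?\d*\.?\d+(?:[eE][-+]?\d+)?")


def char_tokens(d):
    """One token per non-space character (stand-in for byte-level splitting)."""
    return [ch for ch in d if not ch.isspace()]


def atomic_tokens(d):
    """One token per command letter or whole number."""
    return _TOKEN_RE.findall(d)


def segment_tokens(d):
    """One token per command together with all of its parameters."""
    segments = []
    for tok in atomic_tokens(d):
        if tok.isalpha():
            # A command letter starts a new segment.
            segments.append((tok, []))
        else:
            if not segments:
                raise ValueError("path data must start with a command")
            segments[-1][1].append(float(tok))
    return [(cmd, tuple(params)) for cmd, params in segments]


if __name__ == "__main__":
    print(len(char_tokens(PATH_D)),
          len(atomic_tokens(PATH_D)),
          len(segment_tokens(PATH_D)))  # 39 14 4
```

In this toy case the coordinate `90.125` alone costs six character-level tokens, while grouping keeps the whole cubic curve in a single unit. A learned vocabulary of such units is a separate matter that this sketch does not attempt.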