[2509.21150] CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization
Computer Science > Machine Learning
arXiv:2509.21150 (cs)
[Submitted on 25 Sep 2025 (v1), last revised 3 Mar 2026 (this version, v2)]
Title: CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization
Authors: Ruiyu Wang, Shizhao Sun, Weijian Ma, Jiang Bian
Abstract: Computer-Aided Design (CAD) is a foundational component of industrial prototyping, where models are defined not by raw coordinates but by construction sequences such as sketches and extrusions. This sequential structure enables both efficient prototype initialization and subsequent editing. Text-guided CAD prototyping, which unifies Text-to-CAD generation and CAD editing, has the potential to streamline the entire design pipeline. However, prior work has not explored this setting, largely because standard large language model (LLM) tokenizers decompose CAD sequences into natural-language word pieces, failing to capture primitive-level CAD semantics and hindering attention modules from modeling geometric structure. We conjecture that a multimodal tokenization strategy, aligned with CAD's primitive and structural nature, can provide more effective representations. To this end, we propose CAD-Tokenizer, a framework that represents CAD data with modality-specific tokens using a sequence-based VQ-VAE with primitive-l...
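
The tokenization mismatch described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the text serialization in `CAD_SEQUENCE`, the regex-based `wordpiece_like_split` (a crude stand-in for a natural-language subword tokenizer), and `primitive_level_split` are hypothetical and are not the paper's data format, its VQ-VAE, or its tokenizer. The sketch only shows how one primitive can be scattered over many small pieces versus kept as a single unit.

```python
import re

# Hypothetical text serialization of a CAD construction sequence.
# This format is an assumption for illustration, not the paper's format.
CAD_SEQUENCE = "line 0,0 10,0; line 10,0 10,5; arc 10,5 0,5 5,8; extrude 12"


def wordpiece_like_split(text):
    """Crude stand-in for a natural-language tokenizer.

    Splits into runs of letters, single digits, and single punctuation
    characters, so each primitive is spread over many small pieces.
    """
    return re.findall(r"[a-z]+|\d|[^\sa-z\d]", text)


def primitive_level_split(text):
    """Group each command with its parameters.

    Returns one (command, numbers) tuple per primitive, so a unit in the
    output corresponds to one sketch or extrusion primitive.
    """
    primitives = []
    for chunk in text.split(";"):
        parts = chunk.split()
        if not parts:
            continue
        command, params = parts[0], parts[1:]
        # Flatten "x,y" pairs (or single values) into a tuple of integers.
        numbers = tuple(int(v) for p in params for v in p.split(","))
        primitives.append((command, numbers))
    return primitives


if __name__ == "__main__":
    pieces = wordpiece_like_split(CAD_SEQUENCE)
    primitives = primitive_level_split(CAD_SEQUENCE)
    print(len(pieces), "word-piece-like units:", pieces)
    print(len(primitives), "primitive-level units:", primitives)
```

In this toy setting the primitive-level view keeps each command and its coordinates together, which is the kind of structure the abstract argues a word-piece tokenizer fails to expose. A learned tokenizer would map each such unit to discrete codes rather than keep raw tuples.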