[2509.21150] CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization

arXiv - Machine Learning March 05, 2026 4 min read

About this article

Computer Science > Machine Learning

arXiv:2509.21150 (cs)

[Submitted on 25 Sep 2025 (v1), last revised 3 Mar 2026 (this version, v2)]

Title: CAD-Tokenizer: Towards Text-based CAD Prototyping via Modality-Specific Tokenization

Authors: Ruiyu Wang, Shizhao Sun, Weijian Ma, Jiang Bian

Abstract: Computer-Aided Design (CAD) is a foundational component of industrial prototyping, where models are defined not by raw coordinates but by construction sequences such as sketches and extrusions. This sequential structure enables both efficient prototype initialization and subsequent editing. Text-guided CAD prototyping, which unifies Text-to-CAD generation and CAD editing, has the potential to streamline the entire design pipeline. However, prior work has not explored this setting, largely because standard large language model (LLM) tokenizers decompose CAD sequences into natural-language word pieces, failing to capture primitive-level CAD semantics and hindering attention modules from modeling geometric structure. We conjecture that a multimodal tokenization strategy, aligned with CAD's primitive and structural nature, can provide more effective representations. To this end, we propose CAD-Tokenizer, a framework that represents CAD data with modality-specific tokens using a sequence-based VQ-VAE with primitive-l...
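The abstract mentions a VQ-VAE that maps CAD data to modality-specific tokens. As a rough illustration of the general vector-quantization step only (not the authors' implementation), the sketch below assigns each primitive embedding the index of its nearest codebook vector. The codebook values, the two-dimensional embeddings, and the function names quantize and tokenize_primitives are all hypothetical and chosen for readability; a real model would learn both the encoder and the codebook.

```python
from math import dist

# Hypothetical toy codebook: in a trained VQ-VAE these vectors are learned.
CODEBOOK = [
    (0.0, 0.0),
    (1.0, 0.0),
    (0.0, 1.0),
    (1.0, 1.0),
]


def quantize(embedding, codebook=CODEBOOK):
    """Return the index of the codebook vector nearest to `embedding`."""
    return min(range(len(codebook)), key=lambda i: dist(embedding, codebook[i]))


def tokenize_primitives(embeddings, codebook=CODEBOOK):
    """Map a sequence of primitive embeddings to discrete token ids."""
    return [quantize(e, codebook) for e in embeddings]


# Assumed stand-ins for encoder outputs, one vector per CAD primitive.
primitive_embeddings = [(0.1, -0.2), (0.9, 0.8), (0.2, 1.1)]
tokens = tokenize_primitives(primitive_embeddings)
print(tokens)
```

Under these assumptions each primitive becomes one discrete token, in contrast to a word-piece tokenizer that may split a single primitive across several unrelated sub-word tokens.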

Originally published on March 05, 2026. Curated by AI News.

