[2602.20812] Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset
Summary
The paper presents Qwen-BIM, a large language model tailored for BIM-based design, introducing a domain-specific benchmark and dataset that enhance LLM performance in this area.
Why It Matters
As the construction industry increasingly adopts digital transformation, the development of specialized tools like Qwen-BIM is crucial. This research addresses the limitations of general LLMs in domain-specific tasks, providing a foundation for improved BIM applications. The proposed benchmark and dataset could significantly advance the integration of AI in construction design processes.
Key Takeaways
- Qwen-BIM introduces a benchmark for evaluating LLMs in BIM-based design.
- The model achieves a 21.0% improvement in performance over base LLMs.
- A fine-tuning strategy is proposed to adapt LLMs specifically for BIM tasks.
- The research highlights the inadequacy of general LLMs for specialized applications.
- With significantly fewer parameters, Qwen-BIM performs comparably to much larger models.
arXiv:2602.20812 (cs), Computer Science > Artificial Intelligence
Submitted on 24 Feb 2026
Authors: Jia-Rui Lin, Yun-Hong Cai, Xiang-Rui Ni, Shaojie Zhou, Peng Pan
Abstract: As the construction industry advances toward digital transformation, BIM (Building Information Modeling)-based design has become a key driver of intelligent construction. Although Large Language Models (LLMs) have shown potential in promoting BIM-based design, the lack of domain-specific datasets and LLM evaluation benchmarks has significantly hindered their performance. This paper addresses the gap by proposing: 1) an evaluation benchmark for BIM-based design, together with corresponding quantitative indicators for evaluating LLM performance, 2) a method for generating textual data from BIM and constructing corresponding BIM-derived datasets for LLM evaluation and fine-tuning, and 3) a fine-tuning strategy to adapt LLMs for BIM-based design. Results demonstrate that the proposed domain-specific benchmark effectively and comprehensively assesses LLM capabilities, highlighting that general LLMs still fall short on domain-specific tasks. Meanwhile, with the proposed benchmark and datasets,...
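The abstract does not detail how textual data is generated from BIM. As a minimal illustration only, and not the authors' method, the sketch below assumes a BIM element is already available as a plain dictionary of properties and uses a simple template to derive a question-answer record of the kind that could feed evaluation or fine-tuning. The dictionary layout, the field names, and the helpers `element_to_text` and `make_qa_record` are all hypothetical.

```python
import json


def element_to_text(element: dict) -> str:
    """Render one BIM element (hypothetical dict layout) as a sentence."""
    # Sort properties so the generated text is deterministic.
    props = ", ".join(
        f"{key} = {value}" for key, value in sorted(element["properties"].items())
    )
    return f"{element['category']} '{element['name']}' has properties: {props}."


def make_qa_record(element: dict, prop: str) -> dict:
    """Build a question-answer pair about a single property of the element."""
    return {
        "context": element_to_text(element),
        "question": f"What is the {prop} of {element['name']}?",
        "answer": str(element["properties"][prop]),
    }


# Hypothetical example element; real BIM data would come from a model export.
wall = {
    "category": "Wall",
    "name": "W-101",
    "properties": {"height_m": 3.0, "material": "concrete"},
}

record = make_qa_record(wall, "material")
print(json.dumps(record, indent=2))
```

A template like this keeps each record traceable to a single element and property, which may make automatic scoring of model answers easier. Whether the paper uses templates, an LLM, or another approach is not stated in the text available here.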