[2602.20812] Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset

arXiv - AI 4 min read Article

Summary

The paper presents Qwen-BIM, a large language model tailored for BIM-based design, introducing a domain-specific benchmark and dataset that enhance LLM performance in this area.

Why It Matters

As the construction industry increasingly adopts digital transformation, the development of specialized tools like Qwen-BIM is crucial. This research addresses the limitations of general LLMs in domain-specific tasks, providing a foundation for improved BIM applications. The proposed benchmark and dataset could significantly advance the integration of AI in construction design processes.

Key Takeaways

  • Qwen-BIM introduces a benchmark for evaluating LLMs in BIM-based design.
  • The model achieves a 21.0% improvement in performance over base LLMs.
  • A fine-tuning strategy is proposed to adapt LLMs specifically for BIM tasks.
  • The research highlights the inadequacy of general LLMs for specialized applications.
  • Qwen-BIM's performance is comparable to much larger models with significantly fewer parameters.

Computer Science > Artificial Intelligence
arXiv:2602.20812 (cs)
[Submitted on 24 Feb 2026]

Title: Qwen-BIM: developing large language model for BIM-based design with domain-specific benchmark and dataset

Authors: Jia-Rui Lin, Yun-Hong Cai, Xiang-Rui Ni, Shaojie Zhou, Peng Pan

Abstract: As the construction industry advances toward digital transformation, BIM (Building Information Modeling)-based design has become a key driver supporting intelligent construction. Although Large Language Models (LLMs) have shown potential in promoting BIM-based design, the lack of specific datasets and LLM evaluation benchmarks has significantly hindered the performance of LLMs. Therefore, this paper addresses this gap by proposing: 1) an evaluation benchmark for BIM-based design together with corresponding quantitative indicators to evaluate the performance of LLMs, 2) a method for generating textual data from BIM and constructing corresponding BIM-derived datasets for LLM evaluation and fine-tuning, and 3) a fine-tuning strategy to adapt LLMs for BIM-based design. Results demonstrate that the proposed domain-specific benchmark effectively and comprehensively assesses LLM capabilities, highlighting that general LLMs are still incompetent for domain-specific tasks. Meanwhile, with the proposed benchmark and datasets,...
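The second contribution listed in the abstract, generating textual data from BIM for evaluation and fine-tuning, can be pictured with a small sketch. This is not the authors' method: the record fields, the `describe`, `to_sample`, and `build_dataset` helpers, and the prompt/response format are all hypothetical, chosen only to show the general idea of turning structured element data into text samples.

```python
import json

# Hypothetical BIM element records. The field names are illustrative only
# and are not taken from the paper or from any BIM schema.
ELEMENTS = [
    {"id": "W-01", "category": "Wall", "level": "Level 1",
     "properties": {"height_mm": 3000, "material": "Concrete"}},
    {"id": "D-07", "category": "Door", "level": "Level 1",
     "properties": {"width_mm": 900, "fire_rating": "EI30"}},
]


def describe(element):
    """Render one element record as a plain-text sentence."""
    props = ", ".join(
        f"{key}={value}" for key, value in sorted(element["properties"].items())
    )
    return f"{element['category']} {element['id']} on {element['level']} with {props}."


def to_sample(element):
    """Wrap the description in a prompt/response pair (an assumed format)."""
    return {
        "prompt": f"Describe BIM element {element['id']}.",
        "response": describe(element),
    }


def build_dataset(elements):
    """Return one JSON string per element, suitable for a .jsonl file."""
    return [json.dumps(to_sample(element)) for element in elements]


if __name__ == "__main__":
    for line in build_dataset(ELEMENTS):
        print(line)
```

Each output line is a self-contained JSON object, a layout commonly used for instruction-tuning data. A real pipeline would read elements from an actual BIM export rather than a hard-coded list.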

Related Articles

  • [2603.17839] How do LLMs Compute Verbal Confidence
  • [2603.15970] 100x Cost & Latency Reduction: Performance Analysis of AI Query Approximation using Lightweight Proxy Models
  • [2603.10062] Multi-Agent Memory from a Computer Architecture Perspective: Visions and Challenges Ahead
  • [2603.09085] Not All News Is Equal: Topic- and Event-Conditional Sentiment from Finetuned LLMs for Aluminum Price Forecasting