[2504.18880] Reshaping MOFs text mining with a dynamic multi-agents framework of large language model
Summary
The paper presents MOFh6, a large language model driven system that reads raw articles or crystal codes and converts them into standardized synthesis tables for metal-organic frameworks (MOFs). It reports 99% extraction accuracy and a processing time of 9.6 s per full text.
Why It Matters
This research addresses the difficulty of identifying MOF synthesis conditions, which are often scattered, inconsistent, and hard to interpret across the literature. By automating their extraction into structured, ready-to-use parameters, it can speed up experimental design and materials discovery.
Key Takeaways
- MOFh6 achieves 99% extraction accuracy for MOF synthesis data.
- The system processes a full text in 9.6 s and locates synthesis descriptions in 36 s; 100 papers were processed for USD 4.24.
- Real-time extraction replaces static database lookups, enabling scalable data-driven research.
- The framework resolves 94.1% of abbreviation cases across five major publishers by unifying ligand abbreviations with their full names.
- This innovation supports accelerated development of practical synthesis protocols.
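To put the reported throughput and cost in perspective, the per-paper figures from the abstract (9.6 s per full text, USD 4.24 per 100 papers) can be scaled to a larger corpus. The sketch below is a back-of-the-envelope estimate only: the function name `estimate_batch` is hypothetical, and it assumes sequential processing and linear scaling of both time and cost, neither of which the paper states. The 36 s figure for locating synthesis descriptions is left out, since the abstract does not say whether the two timings add.

```python
# Figures reported in the abstract.
SECONDS_PER_FULL_TEXT = 9.6   # time to process one full text
USD_PER_100_PAPERS = 4.24     # cost of processing 100 papers


def estimate_batch(n_papers: int) -> tuple[float, float]:
    """Return (hours, usd) for n_papers.

    Assumption (not from the paper): papers are processed one after
    another, and time and cost both scale linearly with paper count.
    """
    hours = n_papers * SECONDS_PER_FULL_TEXT / 3600
    usd = n_papers * USD_PER_100_PAPERS / 100
    return hours, usd


if __name__ == "__main__":
    hours, usd = estimate_batch(10_000)
    print(f"{hours:.1f} h, USD {usd:.2f}")
```

Under these assumptions the per-paper cost is USD 4.24 / 100, or about USD 0.04; parallel processing would shorten the wall-clock estimate but not the cost.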
arXiv:2504.18880 (cs), Computer Science > Artificial Intelligence
Submitted on 26 Apr 2025 (v1), last revised 21 Feb 2026 (this version, v4)
Authors: Zuhong Lin, Daoyuan Ren, Kai Ran, Jing Sun, Songlin Yu, Xuefeng Bai, Xiaotian Huang, Haiyang He, Pengxu Pan, Ying Fang, Zhanglin Li, Haipu Li, Jingjing Yao
Abstract: Accurately identifying the synthesis conditions of metal-organic frameworks (MOFs) is essential for guiding experimental design, yet remains challenging because relevant information in the literature is often scattered, inconsistent, and difficult to interpret. We present MOFh6, a large language model driven system that reads raw articles or crystal codes and converts them into standardized synthesis tables. It links related descriptions across paragraphs, unifies ligand abbreviations with full names, and outputs structured parameters ready for use. MOFh6 achieved 99% extraction accuracy, resolved 94.1% of abbreviation cases across five major publishers, and maintained a precision of 0.93 ± 0.01. Processing a full text takes 9.6 s, locating synthesis descriptions 36 s, with 100 papers processed for USD 4.24. By replacing static database lookups with real-time extraction, MOFh6 reshapes M...