[2510.18900] Foundation Models for Discovery and Exploration in Chemical Space
Physics > Chemical Physics
arXiv:2510.18900 (physics)
[Submitted on 20 Oct 2025 (v1), last revised 1 May 2026 (this version, v2)]

Title: Foundation Models for Discovery and Exploration in Chemical Space

Authors: Alexius Wadell, Anoushka Bhutani, Victor Azumah, Austin R. Ellis-Mohr, Andrew J. Stier, Kareem Hegazy, Alexander Brace, Hancheng Zhao, Celia Kelly, Anuj K. Nayak, Yuhan Chen, Dimitrios Simatos, Hongyi Lin, Murali Emani, Venkatram Vishwanath, Kevin Gering, Melisa Alkan, Tom Gibbs, Jack Wells, Wesley W. Qian, Richard C. Gerkin, Benjamin Amorelli, Alexander B. Wiltschko, Lav R. Varshney, Bharath Ramsundar, Karthik Duraisamy, Michael W. Mahoney, Arvind Ramanathan, Venkatasubramanian Viswanathan

Abstract: Accurate prediction of atomistic, thermodynamic, and kinetic properties from molecular structures underpins materials innovation. Existing computational and experimental approaches lack the scalability required to navigate chemical space efficiently. Scientific foundation models trained on large unlabelled datasets offer a path towards navigating chemical space across application domains. Here, we develop MIST, a family of molecular foundation models with up to an order of magnitude more parameters and data than prior works. Trained using a novel tokenizer, Smirk, which comprehensively captures nuclear, electronic, and geometric...