[2604.05181] General Multimodal Protein Design Enables DNA-Encoding of Chemistry
Computer Science > Machine Learning
arXiv:2604.05181 (cs)
[Submitted on 6 Apr 2026]

Title: General Multimodal Protein Design Enables DNA-Encoding of Chemistry

Authors: Jarrid Rector-Brooks, Théophile Lambert, Marta Skreta, Daniel Roth, Yueming Long, Zi-Qi Li, Xi Zhang, Miruna Cretu, Francesca-Zhoufan Li, Tanvi Ganapathy, Emily Jin, Avishek Joey Bose, Jason Yang, Kirill Neklyudov, Yoshua Bengio, Alexander Tong, Frances H. Arnold, Cheng-Hao Liu

Abstract: Evolution is an extraordinary engine for enzymatic diversity, yet the chemistry it has explored remains a narrow slice of what DNA can encode. Deep generative models can design new proteins that bind ligands, but none have created enzymes without pre-specifying catalytic residues. We introduce DISCO (DIffusion for Sequence-structure CO-design), a multimodal model that co-designs protein sequence and 3D structure around arbitrary biomolecules, as well as inference-time scaling methods that optimize objectives across both modalities. Conditioned solely on reactive intermediates, DISCO designs diverse heme enzymes with novel active-site geometries. These enzymes catalyze new-to-nature carbene-transfer reactions, including alkene cyclopropanation, spirocyclopropanation, B-H, and C(sp$^3$)-H insertions, with high activities exceeding those of engineered enzymes. Random mutagenesis ...