[2603.27950] Scaling Atomistic Protein Binder Design with Generative Pretraining and Test-Time Compute
Computer Science > Machine Learning
arXiv:2603.27950 (cs)
[Submitted on 30 Mar 2026]

Title: Scaling Atomistic Protein Binder Design with Generative Pretraining and Test-Time Compute

Authors: Kieran Didi, Zuobai Zhang, Guoqing Zhou, Danny Reidenbach, Zhonglin Cao, Sooyoung Cha, Tomas Geffner, Christian Dallago, Jian Tang, Michael M. Bronstein, Martin Steinegger, Emine Kucukbenli, Arash Vahdat, Karsten Kreis

Abstract: Protein interaction modeling is central to protein design, which has been transformed by machine learning, with applications in drug discovery and beyond. In this landscape, structure-based de novo binder design is cast either as conditional generative modeling or as sequence optimization via structure predictors ("hallucination"). We argue that this is a false dichotomy and propose Proteina-Complexa, a novel fully atomistic binder generation method unifying both paradigms. We extend recent flow-based latent protein generation architectures and leverage the domain-domain interactions of monomeric, computationally predicted protein structures to construct Teddymer, a new large-scale dataset of synthetic binder-target pairs for pretraining. Combined with high-quality experimental multimers, this enables training a strong base model. We then perform inference-time optimization with this g...
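The abstract combines generative sampling with inference-time (test-time) optimization. As a rough illustration of the test-time-compute idea only, here is a minimal best-of-N sketch: draw many candidates from a generative model and keep the one a scoring oracle ranks highest. The functions `sample_binder`, `score_binder`, and `best_of_n` are hypothetical stand-ins, not the paper's actual method or API.

```python
import random

# Hypothetical best-of-N inference-time optimization sketch.
# A real pipeline would sample from a pretrained flow-based generative
# model and score with, e.g., a structure predictor's interface confidence;
# both are mocked here with toy stand-ins.

def sample_binder(rng: random.Random) -> str:
    # Stand-in: draw one candidate 12-residue sequence at random.
    return "".join(rng.choice("ACDEFGHIKLMNPQRSTVWY") for _ in range(12))

def score_binder(seq: str) -> float:
    # Stand-in oracle: toy score rewarding hydrophobic residues,
    # in place of a learned confidence metric.
    return sum(1 for aa in seq if aa in "ILVFWY") / len(seq)

def best_of_n(n: int, seed: int = 0) -> str:
    # Spend more compute (larger n) to search for a higher-scoring design.
    rng = random.Random(seed)
    candidates = [sample_binder(rng) for _ in range(n)]
    return max(candidates, key=score_binder)

if __name__ == "__main__":
    best = best_of_n(64)
    print(best, round(score_binder(best), 3))
```

Because the first candidate is shared under a fixed seed, increasing N can only improve (or match) the selected score, which is the basic scaling behavior behind test-time compute.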