[2507.07390] Learning Collective Variables from BioEmu with Time-Lagged Generation

arXiv - Machine Learning 4 min read Article

Summary

This article presents BioEmu-CV, a framework that automatically learns collective variables (CVs) from BioEmu, a foundation model for generating protein equilibrium samples. The learned CVs are meant to drive enhanced sampling of rare events such as protein folding.

Why It Matters

Molecular dynamics underpins much of biochemistry and drug design, but rare events such as protein folding unfold over timescales that plain simulation struggles to reach. Enhanced sampling methods accelerate the simulation along key reaction pathways, and those pathways are defined by collective variables. BioEmu-CV targets the bottleneck of identifying CVs that capture a system's slow, macroscopic dynamics.

Key Takeaways

  • BioEmu-CV learns collective variables automatically from BioEmu, a foundation model for generating protein equilibrium samples.
  • The learned CVs are intended for enhanced sampling of rare events like protein folding.
  • Training on time-lagged generation pushes the CV to encode slow, long-term dynamics while disregarding fast, random fluctuations.
  • Validated on fast-folding proteins, it shows practical applications in estimating free energy differences.
  • Provides a comprehensive benchmark for machine-learned collective variables.
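
To make the free energy takeaway concrete, here is a minimal sketch of the textbook relation ΔF = -k_B T ln(p_A / p_B) applied to unbiased equilibrium samples of a one-dimensional CV. It is not the paper's estimation procedure. The function name, the threshold-based definition of the two states, and the kcal/mol units are assumptions made for illustration only.

```python
import numpy as np

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption of this sketch.
K_B = 0.0019872041


def free_energy_difference(cv_values, threshold, temperature=300.0):
    """Return F(cv >= threshold) - F(cv < threshold) in kcal/mol.

    Hypothetical helper: assumes cv_values are unbiased equilibrium samples
    and that a single threshold on the CV separates the two states.
    """
    cv_values = np.asarray(cv_values, dtype=float)
    p_high = np.mean(cv_values >= threshold)  # population of the high-CV state
    p_low = 1.0 - p_high                      # population of the low-CV state
    if p_high == 0.0 or p_low == 0.0:
        raise ValueError("both states must be sampled at least once")
    return -K_B * temperature * np.log(p_high / p_low)


# Toy data: 90 samples in the low-CV state, 10 in the high-CV state.
samples = np.array([0.0] * 90 + [1.0] * 10)
delta_f = free_energy_difference(samples, threshold=0.5)
print(f"Free energy difference: {delta_f:.3f} kcal/mol")
```

A positive value means the high-CV state is less populated than the low-CV state. With biased (enhanced sampling) trajectories the samples would first need reweighting, which this sketch does not attempt.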

arXiv:2507.07390 (cs). Submitted on 10 Jul 2025 (v1), last revised 22 Feb 2026 (this version, v4).

Title: Learning Collective Variables from BioEmu with Time-Lagged Generation

Authors: Seonghyun Park, Kiyoung Seong, Soojung Yang, Rafael Gómez-Bombarelli, Sungsoo Ahn

Abstract

Molecular dynamics is crucial for understanding molecular systems but its applicability is often limited by the vast timescales of rare events like protein folding. Enhanced sampling techniques overcome this by accelerating the simulation along key reaction pathways, which are defined by collective variables (CVs). However, identifying effective CVs that capture the slow, macroscopic dynamics of a system remains a major bottleneck. This work proposes a novel framework coined BioEmu-CV that learns these essential CVs automatically from BioEmu, a recently proposed foundation model for generating protein equilibrium samples. In particular, we re-purpose BioEmu to learn time-lagged generation conditioned on the learned CV, i.e., predict the distribution of molecular states after a certain amount of time. This training process promotes the CV to encode only the slow, long-term information while disregarding fast, random fluctuations. We validate our learned CV on fast-folding proteins with two key applicatio...
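
The intuition behind time-lagged training can be illustrated with a toy example: after a time lag, only slowly varying coordinates still carry information about where the system started. The sketch below does not use BioEmu and is not the paper's training procedure. It simulates two independent AR(1) processes, one slow and one fast, and compares their autocorrelation at a fixed lag. The function names, the AR(1) coefficients, and the lag are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_ar1(phi, n_steps, rng):
    """Simulate a unit-variance AR(1) process x[t] = phi * x[t-1] + noise."""
    x = np.zeros(n_steps)
    # Scale the noise so the stationary variance is 1 for any phi in (-1, 1).
    noise = rng.standard_normal(n_steps) * np.sqrt(1.0 - phi**2)
    for t in range(1, n_steps):
        x[t] = phi * x[t - 1] + noise[t]
    return x


def lagged_autocorrelation(x, lag):
    """Pearson correlation between x[t] and x[t + lag]."""
    a = x[:-lag] - x[:-lag].mean()
    b = x[lag:] - x[lag:].mean()
    return float(np.dot(a, b) / np.sqrt(np.dot(a, a) * np.dot(b, b)))


rng = np.random.default_rng(0)
slow = simulate_ar1(phi=0.999, n_steps=50_000, rng=rng)  # slow coordinate
fast = simulate_ar1(phi=0.5, n_steps=50_000, rng=rng)    # fast fluctuation

lag = 100
slow_corr = lagged_autocorrelation(slow, lag)
fast_corr = lagged_autocorrelation(fast, lag)
print(f"slow coordinate, lag {lag}: {slow_corr:.3f}")
print(f"fast coordinate, lag {lag}: {fast_corr:.3f}")
```

In this toy setting the slow coordinate remains strongly correlated with its time-lagged value, while the fast coordinate is essentially uncorrelated. A variable that must help predict the time-lagged state therefore gains nothing from encoding the fast coordinate, which is the behavior the abstract attributes to the learned CV.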
