Poster in Workshop: Learning Meaningful Representations of Life (LMRL) Workshop @ ICLR 2025

Learning to Predict Ensembles of Protein Conformations from Molecular Dynamics Simulation Trajectories

Bongjin Koo · Patrick Jiang · Soumya Dutta · I. Can Kazan · Banu Ozkan · Paul Kim · Abhishek Singharoy · Tristan Bepler


Abstract:

A group of heterogeneous conformations of a protein, also known as an ensemble of conformations, is key to understanding protein function. This is because many proteins are mechanical machines that perform tasks by changing their shapes. Nevertheless, the main focus of protein structure prediction from a sequence has so far been to accurately predict a single structure, e.g., AlphaFold (AF) [Abramson et al. (2024)] and ESMFold [Lin et al. (2023)]. Recently, methods that predict multiple conformations by subsampling MSAs (multiple sequence alignments) [del Alamo et al. (2022)] or by clustering MSAs [Wayment-Steele et al. (2024)] were introduced. While they can predict heterogeneous conformations, they are limited in the diversity of the predicted structures and in their trainability on data other than Protein Data Bank (PDB) [Berman et al. (2000)] structures, such as molecular dynamics (MD) simulation trajectories. AlphaFlow [Jing et al. (2024)] overcame this limitation by incorporating a Flow Matching (FM) [Lipman et al. (2023)] framework with AlphaFold as the denoising model. Since an FM model can generate diverse samples by transforming initial samples drawn from a prior distribution, AlphaFlow has the potential to generate ensembles of conformations. The authors showed that it can be trained on MD trajectories and generate physically feasible ensembles. In this paper, we look more closely at AlphaFlow's ability to learn MD ensembles generated using Temperature Replica Exchange Molecular Dynamics (T-REMD) [Qi et al. (2018)]. This is an exploratory study that precedes improving its architecture and proposing our own model.
