Poster in Workshop: Integrating Generative and Experimental Platforms for Biomolecular Design

Efficient, Few-shot Directed Evolution with Energy Rank Alignment

Sebastian Ibarraran ⋅ Shriram Chennakesavalu ⋅ Frank Hu ⋅ Grant Rotskoff


Abstract

Directed evolution is a powerful and widely used technique for protein engineering, and reducing the cost of iterated experimental observations has become a major priority for practitioners. Recent efforts to improve sequence selection with machine-learning-based predictors have yielded remarkable gains in efficiency, but the sparse data available at each experimental iteration restricts these approaches to extremely simple models. Adapting large-scale pre-trained protein language models with experimental data offers an alternative: we show that it productively leverages the strong inductive biases of the natural distribution of protein sequences to navigate high-dimensional, combinatorially large fitness landscapes. Our approach uses a general-purpose "post-training" algorithm grounded in statistical physics that employs quantitative experimental rankings to directly produce a sampler for diverse, high-fitness sequences with fewer data points than competing methods.
