
Workshop: Machine Learning for Drug Discovery (MLDD)

Deep sharpening of topological features for de novo protein design

Zander Harteveld · Joshua Southern · Michaël Defferrard · Andreas Loukas · Pierre Vandergheynst · Michael Bronstein · Bruno Correia

Keywords: [ variational autoencoder ]


Computational de novo protein design allows the exploration of uncharted areas of the protein structure and sequence spaces. Classical approaches to de novo protein design involve an iterative process in which the desired protein shape is outlined, structural backbones are sampled, and low-energy amino acid sequences are designed for them. Despite numerous successes, inaccuracies in energy functions and sampling methods often lead to physically unrealistic protein backbones, yielding sequences that fail to fold experimentally. Recently, deep neural networks have been used successfully to design novel protein folds from scratch by iteratively predicting a structure and optimizing the sequence until a target protein structure is reached. These methods work well when distributions of physically realistic target protein backbones can be readily defined, but they lack the ability to design loosely specified protein shapes de novo. In fact, a major challenge for de novo protein design is to generate "designable" protein structures for defined folds, including native and artificial ("dark matter") folds, that can then be used to find low-energy sequences in a generic manner. Here, we automate the task of creating designable backbones using a variational autoencoder framework, termed Genesis, to denoise sketches of protein topological lattice models by sharpening their 2D representations in distance and angle feature maps. In conjunction with the trRosetta design framework, large pools of diverse sequences for different protein folds were generated for the maps. We found that the Genesis-trDesign framework generates native-like feature maps for known and dark matter protein folds. Ultimately, the Genesis framework addresses the protein backbone designability problem and could contribute to the de novo design of structurally defined artificial proteins that can be tailored for novel functionalities.
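
To illustrate what a 2D distance feature map of a backbone sketch looks like, the following is a minimal Python sketch. It is not the Genesis implementation: the function name `distance_map` and the toy coordinates are assumptions made for illustration only, and angle maps are omitted.

```python
import math

def distance_map(coords):
    """Return the symmetric matrix of pairwise Euclidean distances.

    `coords` is a list of (x, y, z) points, e.g. one point per residue
    of a coarse backbone sketch (hypothetical input format).
    """
    n = len(coords)
    dmap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(coords[i], coords[j])
            dmap[i][j] = dmap[j][i] = d  # the map is symmetric
    return dmap

# Hypothetical toy sketch: four points on a straight line, spaced 3.8 apart.
sketch = [(0.0, 0.0, 3.8 * k) for k in range(4)]
dmap = distance_map(sketch)
```

A map like `dmap` is the kind of 2D representation that a denoising model could take as input and return in sharpened form; how the abstract's framework encodes and refines its maps is not specified here.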
