Poster in Workshop: Generative and Experimental Perspectives for Biomolecular Design

Demystify the Secret Function in Protein Sequence via Conditional Diffusion Models

Yaoyao Xu · Xuxi Chen · Tong Wang · Huan He · Tianlong Chen · Manolis Kellis


Abstract:

Generating accurate functional annotations for protein sequences is a significant challenge, especially when the target captions are lengthy yet contain concise descriptions. Diffusion models have recently shown impressive empirical performance on sequence-to-sequence generation tasks. In this paper, we propose ProCDM, a conditional diffusion generative model that uses protein sequence representations to generate functional descriptions of proteins. ProCDM employs a contrastive learning framework to extract protein embeddings and align them with their functionality, then generates functional descriptions by denoising within the continuous embedding space. ProCDM can generate a wide range of functional descriptions that align with the proteins' actual functionality. We conduct comprehensive experiments on the EC-Caption datasets to evaluate the effectiveness of our proposal.
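
The two ingredients named in the abstract, contrastive alignment of protein and description embeddings and diffusion in a continuous embedding space, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the loss, the noise schedule, or the encoders, so the symmetric InfoNCE-style loss, the DDPM-style forward noising step, the temperature value, and all function names below (`contrastive_loss`, `forward_noise`) are assumptions chosen for illustration.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Scale each row to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def contrastive_loss(protein_emb, text_emb, temperature=0.07):
    # Assumed symmetric InfoNCE-style loss: row i of protein_emb is the
    # positive match for row i of text_emb; all other rows are negatives.
    p = l2_normalize(protein_emb)
    t = l2_normalize(text_emb)
    logits = p @ t.T / temperature
    idx = np.arange(len(p))
    loss_p2t = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_t2p = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_p2t + loss_t2p)

def forward_noise(x0, alpha_bar_t, rng):
    # Assumed DDPM-style forward step on continuous embeddings:
    # x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps.
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps
    return x_t, eps

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    protein = rng.standard_normal((4, 8))  # stand-in protein embeddings
    caption = protein + 0.1 * rng.standard_normal((4, 8))  # stand-in description embeddings
    print("contrastive loss:", contrastive_loss(protein, caption))
    noisy, eps = forward_noise(caption, alpha_bar_t=0.5, rng=rng)
    print("noisy embedding shape:", noisy.shape)
```

In a conditional diffusion setup of this kind, a denoising network would typically be trained to recover `eps` (or the clean embedding) from `noisy`, conditioned on the protein embedding; that network is omitted here because the abstract gives no details about it.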
