Poster in Workshop: Generative and Experimental Perspectives for Biomolecular Design
How well do generative protein models generate?
Han Spinner · Aaron Kollasch · Debora Marks
Protein design relies critically on the generation of plausible sequences. Yet the efficacy of many common model architectures for sequence sampling remains uncertain, from simple, interpretable models such as position-specific scoring matrices (PSSMs) and direct coupling analysis (DCA) to newer, less interpretable models such as variational autoencoders (VAEs), autoregressive large language models (AR-LLMs), and flow matching (FM). While some models offer unique sequence generation methods, issues such as mode collapse, generation of nonsensical repeats, and protein truncations persist. Trusted methods like Gibbs sampling are often preferred for their reliability but can be computationally expensive. This paper addresses the need to evaluate the performance and limitations of different generation methods for protein models, considering their dependence on multiple sequence alignment (MSA) depth and available sequence diversity. We propose rigorous evaluation methods and metrics to assess sequence generation, aiming to guide design decisions and inform the development of future models and sampling techniques for protein design applications.