

Poster

Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design

Chenyu Wang · Masatoshi Uehara · Yichun He · Amy Wang · Avantika Lal · Tommi Jaakkola · Sergey Levine · Aviv Regev · Hanchen Wang · Tommaso Biancalani

Hall 3 + Hall 2B #576
Thu 24 Apr 7 p.m. PDT — 9:30 p.m. PDT

Abstract:

Recent studies have demonstrated the strong empirical performance of diffusion models on discrete sequences (i.e., discrete diffusion models) across domains such as natural language and biological sequence generation. For example, in the protein inverse folding task, where the goal is to generate a protein sequence from a given backbone structure, conditional diffusion models have achieved impressive results in generating "natural" sequences that fold back into the original structure. However, practical design tasks often require not only modeling a conditional distribution but also optimizing specific task objectives. For instance, in the inverse folding task, we may prefer proteins with high stability. To address this, we consider the scenario where we have pre-trained discrete diffusion models that can generate "natural" sequences, as well as reward models that map sequences to task objectives. We then formulate the reward maximization problem within discrete diffusion models, analogous to reinforcement learning (RL), while minimizing the KL divergence against pre-trained diffusion models to preserve naturalness. To solve this RL problem, we propose a novel algorithm that enables direct backpropagation of rewards through entire trajectories generated by diffusion models, by making the originally non-differentiable trajectories differentiable using the Gumbel-Softmax trick. Our theoretical analysis indicates that our approach can generate sequences that are both "natural" (i.e., have a high probability under a pre-trained model) and yield high rewards. While similar tasks have been recently explored in diffusion models for continuous domains, our work addresses unique algorithmic and theoretical challenges specific to discrete diffusion models, which arise from their foundation in continuous-time Markov chains rather than Brownian motion. Finally, we demonstrate the effectiveness of our algorithm in generating DNA and protein sequences that optimize enhancer activity and protein stability, respectively, important tasks for gene therapies and protein-based therapeutics. The code is available at https://github.com/ChenyuWang-Monica/DRAKES.
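To make the reward-plus-KL objective and the Gumbel-Softmax relaxation described in the abstract more concrete, here is a minimal sketch in Python/PyTorch. It is not the authors' DRAKES implementation: the model interfaces (policy_model, pretrained_model, reward_model), the soft one-hot state representation, and the hyperparameters alpha and tau are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_sample(logits, tau=0.5):
    # Differentiable "soft" one-hot sample from categorical logits
    # via the Gumbel-Softmax relaxation.
    gumbel_noise = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    return F.softmax((logits + gumbel_noise) / tau, dim=-1)

def finetune_loss(policy_model, pretrained_model, reward_model,
                  x_T, num_steps, alpha=0.1, tau=0.5):
    """Roll out a relaxed denoising trajectory with the fine-tuned (policy)
    model, score the final sequence with the reward model, and regularize
    each step's transition distribution toward the pre-trained model with a
    KL penalty. All tensors are soft one-hots of shape (batch, length, vocab)."""
    x = x_T
    kl_total = 0.0
    for t in reversed(range(num_steps)):
        logits = policy_model(x, t)                 # denoising logits at step t
        with torch.no_grad():
            ref_logits = pretrained_model(x, t)     # frozen reference logits
        # KL(policy || pre-trained) over the per-token transition distributions
        kl_total = kl_total + F.kl_div(
            F.log_softmax(ref_logits, dim=-1),
            F.softmax(logits, dim=-1),
            reduction="batchmean",
        )
        # Relaxed sample keeps the whole trajectory differentiable,
        # so the reward gradient can flow back through every step.
        x = gumbel_softmax_sample(logits, tau)
    reward = reward_model(x).mean()
    return -reward + alpha * kl_total
```

Lowering tau makes the relaxed samples closer to hard one-hots (lower bias, higher gradient variance), while alpha trades off reward maximization against staying close to the pre-trained model, i.e., preserving "naturalness."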
