Poster in Workshop: Deep Generative Model in Machine Learning: Theory, Principle and Efficacy
Gumbel-Softmax Score and Flow Matching for Discrete Biological Sequence Generation
Sophia Tang · Yinuo Zhang · Alexander Tong · Pranam Chatterjee
Keywords: [ protein design ] [ Flow matching ] [ DNA design ] [ Gumbel-Softmax ]
We introduce Gumbel-Softmax Score and Flow Matching, a generative framework built on a novel Gumbel-Softmax interpolant that uses a time-dependent temperature parameter to interpolate from smooth categorical distributions to distributions concentrated at a single vertex of the simplex. Using this interpolant, we derive Gumbel-Softmax Flow Matching, which learns a parameterized velocity field that transports smooth categorical distributions to the vertices of the simplex. As an alternative, we present Gumbel-Softmax Score Matching, which learns to regress the gradient of the probability density. Our approach supports controllable generation through tunable temperatures and stochastic Gumbel noise during inference, enabling efficient de novo sequence design. Our experiments demonstrate state-of-the-art performance in conditional DNA promoter design and strong results in de novo sequence-only protein generation.
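To make the idea of a temperature-controlled interpolant concrete, the following is a minimal sketch and not the authors' implementation. It assumes an exponentially decaying schedule for the temperature and a one-hot logit for the target token; the function names, the parameters `tau_max`, `decay`, and `noise_scale`, and the schedule itself are all hypothetical choices made for illustration. At high temperature the output is close to uniform on the simplex, and as the temperature shrinks it concentrates near a vertex.

```python
import numpy as np


def temperature(t, tau_max=10.0, decay=5.0):
    """Hypothetical schedule: temperature decays exponentially as t goes 0 -> 1."""
    return tau_max * np.exp(-decay * t)


def gumbel_softmax_interpolant(token, num_classes, t, rng, noise_scale=1.0):
    """Return a point on the probability simplex for a target token at time t.

    Assumption: logits are the one-hot encoding of the token plus scaled
    Gumbel(0, 1) noise, divided by the time-dependent temperature.
    """
    one_hot = np.eye(num_classes)[token]
    # Sample Gumbel(0, 1) noise via the inverse-CDF transform of uniform draws.
    u = rng.uniform(low=1e-9, high=1.0, size=num_classes)
    gumbel = -np.log(-np.log(u))
    logits = (one_hot + noise_scale * gumbel) / temperature(t)
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


rng = np.random.default_rng(0)
# Four classes, e.g. the DNA alphabet A, C, G, T; token index 2 is the target.
x_early = gumbel_softmax_interpolant(2, 4, t=0.0, rng=rng, noise_scale=0.0)
x_late = gumbel_softmax_interpolant(2, 4, t=1.0, rng=rng, noise_scale=0.0)
```

With the noise switched off (`noise_scale=0.0`), `x_early` stays near the uniform distribution while `x_late` places almost all of its mass on the target token. Leaving the noise on gives stochastic samples around the same path, which corresponds to the stochastic Gumbel noise the abstract mentions for inference.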