Hit Expansion via Localized Exploration of Synthesizable Chemical Space
Abstract
Generative models for drug design that directly produce synthetic pathways have gained popularity because they constrain the search space to synthetically accessible molecules. However, existing methods focus primarily on de novo molecular design and rarely start the generation process from known binders. In this paper, we present HELiX, a template-based GFlowNet framework for localized exploration of chemical space. HELiX first learns to deconstruct a given hit by reversing selected reaction steps, and then performs forward synthesis in a manner that preserves synthetic tractability. Our approach efficiently identifies diverse, high-scoring analogs of known binders, and it addresses the limited sample efficiency of GFlowNets by incorporating a Bayesian optimization loop that balances exploration and exploitation. We also show that local exploration is inherently robust to noisy oracle evaluations, a common problem in drug development when in silico predictors of binding affinity are used.