Poster in Workshop: AI4MAT-ICLR-2025: AI for Accelerated Materials Design
It Takes Two to Tango: Directly Optimizing for Constrained Synthesizability in Generative Molecular Design
Jeff Guo · Philippe Schwaller
Keywords: [ drug discovery ] [ constrained synthesizability ] [ generative design ]
Constrained synthesizability is an unaddressed challenge in generative molecular design. In particular, it requires designing molecules that satisfy multi-parameter optimization objectives while also being synthesizable, with specific building blocks enforced in the synthesis. This is practically important for molecule re-purposing, sustainability, and efficiency. In this work, we propose a novel reward function called TANimoto Group Overlap (TANGO), which uses chemistry principles to transform a sparse reward function into a dense one, a property that is crucial for reinforcement learning (RL). TANGO can augment molecular generative models to directly optimize for constrained synthesizability while simultaneously optimizing for other properties relevant to drug discovery. Our framework is general and addresses starting-material, intermediate, and divergent synthesis constraints. Contrary to many existing works in the field, we show that incentivizing a general-purpose model with RL is a productive approach to navigating challenging synthesizability optimization scenarios. We demonstrate this by showing that the trained models explicitly learn a desirable distribution. Our framework is the first generative approach to successfully address constrained synthesizability.
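To illustrate why a dense reward helps RL, the following is a minimal sketch and not the TANGO definition itself, which the abstract does not give. It assumes each molecule in a predicted synthesis route is represented as a set of hypothetical feature labels, and it compares a sparse exact-match reward against a dense reward based on plain Tanimoto similarity to the enforced building block. The function names and feature labels are invented for illustration.

```python
from typing import FrozenSet, Iterable


def tanimoto(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0  # two empty sets are treated as identical
    return len(a & b) / len(a | b)


def sparse_reward(route: Iterable[FrozenSet[str]], enforced: FrozenSet[str]) -> float:
    """1.0 only if the enforced block appears exactly in the route, else 0.0."""
    return 1.0 if any(node == enforced for node in route) else 0.0


def dense_reward(route: Iterable[FrozenSet[str]], enforced: FrozenSet[str]) -> float:
    """Highest similarity between any route molecule and the enforced block."""
    return max((tanimoto(node, enforced) for node in route), default=0.0)


# Hypothetical feature labels; a real system would use molecular fingerprints.
enforced = frozenset({"aryl_bromide", "amide", "pyridine", "methyl"})
near_miss = [frozenset({"aryl_bromide", "amide", "pyridine"}), frozenset({"amine"})]
exact = [enforced, frozenset({"amine"})]

print(sparse_reward(near_miss, enforced), dense_reward(near_miss, enforced))
print(sparse_reward(exact, enforced), dense_reward(exact, enforced))
```

Under the sparse reward, a route that nearly contains the enforced block scores the same as one that is unrelated to it, so the RL agent receives no signal about how close it is. The dense variant gives partial credit that grows as the route approaches the constraint.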