ICLR Evaluating Generalization in GFlowNets for Molecule Design

Poster in Workshop: Machine Learning for Drug Discovery (MLDD)

Evaluating Generalization in GFlowNets for Molecule Design

Andrei Nica · Moksh Jain · Emmanuel Bengio · Cheng-Hao Liu · Maksym Korablyov · Michael Bronstein · Yoshua Bengio

Keywords: [ diversity ] [ Molecule Design ]

Abstract:

Deep learning holds promise for drug discovery problems such as de novo molecular design. Generating data to train such models is costly and time-consuming, since it requires wet-lab experiments or expensive simulations, and the problem is compounded by the notorious data-hungriness of machine learning algorithms. In small-molecule generation, the recently proposed GFlowNet method has shown good performance in generating diverse, high-scoring candidates and has the notable advantage of being an off-policy offline method. Finding an appropriate generalization evaluation metric for such models, one that is predictive of the desired search performance (i.e., finding diverse, high-scoring candidates), will help guide online data collection for such an algorithm. In this work, we develop techniques for evaluating GFlowNet performance on a test set and identify the most promising metric for predicting generalization. We present empirical results on several small-molecule design tasks in drug discovery, across several GFlowNet training setups, and find a metric that is strongly correlated with diverse, high-scoring batch generation. This metric should be used to identify the best generative model from which to sample batches of molecules to be evaluated.
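To make the idea of a metric "predictive of search performance" concrete, the following is a minimal sketch of how candidate test-set metrics could be compared by rank correlation with a search-performance score across several trained models. It is not the authors' procedure: the abstract does not name the metric or the correlation measure, so the use of Spearman correlation, the function names (`ranks`, `pearson`, `spearman`, `best_metric`), and all numbers are assumptions made purely for illustration.

```python
from typing import Dict, List, Sequence


def ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def best_metric(metrics: Dict[str, Sequence[float]],
                search_performance: Sequence[float]) -> str:
    """Name of the metric whose ranking of models best matches search performance."""
    return max(metrics, key=lambda name: spearman(metrics[name], search_performance))


if __name__ == "__main__":
    # Hypothetical values, one entry per trained model (not from the paper).
    candidate_metrics = {
        "metric_a": [0.10, 0.25, 0.40, 0.55],
        "metric_b": [0.90, 0.20, 0.60, 0.30],
    }
    # Hypothetical score for diverse, high-scoring batch generation per model.
    search_performance = [1.2, 1.9, 2.4, 3.1]
    for name, values in candidate_metrics.items():
        print(name, round(spearman(values, search_performance), 3))
    print("selected:", best_metric(candidate_metrics, search_performance))
```

Rank correlation is used here only because model selection depends on the ordering of models rather than on the metric's absolute scale; another correlation measure could be substituted under the same setup.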
