Poster

Generalization and Distributed Learning of GFlowNets

Tiago Silva · Amauri Souza · Omar Rivasplata · Vikas Garg · Samuel Kaski · Diego Mesquita

Hall 3 + Hall 2B #577
Fri 25 Apr, 7:00–9:30 p.m. PDT

Abstract:

Conventional wisdom attributes the success of Generative Flow Networks (GFlowNets) to their ability to exploit the compositional structure of the sample space for learning generalizable flow functions (Bengio et al., 2021). Despite the abundance of empirical evidence, formalizing this belief with verifiable non-vacuous statistical guarantees has remained elusive. We address this issue with the first data-dependent generalization bounds for GFlowNets. We also elucidate the negative impact of the state space size on the generalization performance of these models via Azuma-Hoeffding-type oracle PAC-Bayesian inequalities. We leverage our theoretical insights to design a novel distributed learning algorithm for GFlowNets, which we call Subgraph Asynchronous Learning (SAL). In a nutshell, SAL utilizes a divide-and-conquer strategy: multiple GFlowNets are trained in parallel on smaller subnetworks of the flow network, and then aggregated with an additional GFlowNet that allocates appropriate flow to each subnetwork. Our experiments with synthetic and real-world problems demonstrate the benefits of SAL over centralized training in terms of mode coverage and distribution matching.
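The abstract describes SAL only at a high level, and the authors' actual training procedure is not reproduced here. The sketch below is a minimal, self-contained Python illustration of the divide-and-conquer idea under stated assumptions: the binary-string sample space, the reward function, the partition by first bit, and the exact lookup tables standing in for trained local GFlowNets are all illustrative choices, not the paper's implementation. Each "subnetwork" sampler draws proportionally to reward within its piece, and the aggregator allocates flow across pieces in proportion to their partition functions, which recovers the global reward-proportional distribution.

```python
import random
from collections import Counter

# Toy sample space: binary strings of length 4.
# Illustrative reward (assumption): number of ones, plus one.
def reward(x):
    return x.count("1") + 1

samples = [format(i, "04b") for i in range(16)]

# Partition the space into "subnetworks" by the first bit; in SAL each
# piece would be a subgraph of the flow network trained by one worker.
subnets = {b: [x for x in samples if x[0] == b] for b in "01"}

# Per-subnetwork partition functions. Here we compute them exactly;
# a trained local GFlowNet would approximate this distribution.
local_Z = {b: sum(reward(x) for x in xs) for b, xs in subnets.items()}

def sample_local(b):
    """Stand-in for a local GFlowNet: sample within subnetwork b
    proportionally to reward."""
    xs = subnets[b]
    return random.choices(xs, weights=[reward(x) for x in xs], k=1)[0]

def sample_sal():
    """Stand-in for the aggregating GFlowNet: allocate flow to each
    subnetwork in proportion to its partition function, then defer
    to the chosen local sampler."""
    keys = list(subnets)
    key = random.choices(keys, weights=[local_Z[k] for k in keys], k=1)[0]
    return sample_local(key)

# Check that the composed sampler matches the global target R(x)/Z.
counts = Counter(sample_sal() for _ in range(50_000))
Z = sum(local_Z.values())
for x in sorted(samples):
    print(x, f"empirical={counts[x] / 50_000:.3f}", f"target={reward(x) / Z:.3f}")
```

Matching each subnetwork's allocated flow to its partition function is what makes the composition exact in this toy setting; in practice the aggregating GFlowNet would have to learn these allocations rather than compute them in closed form.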
