Keywords: [ importance sampling ]
Stochastic differential equations provide a rich class of flexible generative models, capable of describing a wide range of spatio-temporal processes. A host of recent work looks to learn data-representing SDEs, using neural networks and other flexible function approximators. Despite these advances, learning remains computationally expensive due to the sequential nature of SDE integrators. In this work, we propose an importance-sampling estimator for probabilities of observations of SDEs for the purposes of learning. Crucially, the approach we suggest does not rely on such integrators. The proposed method produces lower-variance gradient estimates compared to algorithms based on SDE integrators and has the added advantage of being embarrassingly parallelizable. This facilitates the effective use of large-scale parallel hardware for massive decreases in computation time.