The capacity of existing graph neural networks (GNNs) is limited by their use of local or isotropic kernels. We present Graph Anisotropic Diffusion, a new GNN architecture built on a diffusion PDE on graphs. The main ingredient of our model is a linear diffusion layer, solvable with either of two schemes, that is applied independently to each feature channel of the graph's vertices. This learned layer improves the message-passing mechanism of GNNs by propagating information continuously among nodes, with control over the diffusion step. It is combined with local anisotropic kernels to obtain a notion of direction. Empirically, we demonstrate that our model improves on the performance of competitive GNNs on two common molecular property prediction benchmarks.
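To make the per-channel linear diffusion concrete, the following is a minimal NumPy sketch, not the model's actual implementation. The abstract does not name its two solving schemes, so the sketch assumes two standard ways of solving the graph heat equation du/dt = -Lu: a single implicit (backward) Euler step and a spectral expansion in the Laplacian eigenbasis. The function names, the combinatorial Laplacian L = D - A, and the fixed per-channel diffusion times `t` are illustrative assumptions. In a trained model `t` would be a learned parameter, and the anisotropic kernels are omitted here.

```python
import numpy as np


def graph_laplacian(adj):
    """Combinatorial Laplacian L = D - A of a symmetric adjacency matrix."""
    return np.diag(adj.sum(axis=1)) - adj


def diffuse_implicit(lap, x, t):
    """Assumed scheme 1: one backward-Euler step, (I + t_c L) u = x[:, c].

    x has shape (nodes, channels); t has shape (channels,), so every
    feature channel is diffused independently with its own time.
    """
    n = lap.shape[0]
    out = np.empty_like(x, dtype=float)
    for c in range(x.shape[1]):
        out[:, c] = np.linalg.solve(np.eye(n) + t[c] * lap, x[:, c])
    return out


def diffuse_spectral(lap, x, t):
    """Assumed scheme 2: exact heat kernel exp(-t_c L) via eigendecomposition."""
    evals, evecs = np.linalg.eigh(lap)            # L is symmetric
    coeffs = evecs.T @ x                          # project onto the eigenbasis
    decay = np.exp(-evals[:, None] * t[None, :])  # per-mode, per-channel decay
    return evecs @ (decay * coeffs)               # back to the node domain


# Hypothetical usage: a 4-node path graph with 2 feature channels.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
lap = graph_laplacian(adj)
x = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0], [3.0, -1.0]])
t = np.array([0.5, 2.0])  # one diffusion time per channel (fixed here)
smoothed_implicit = diffuse_implicit(lap, x, t)
smoothed_spectral = diffuse_spectral(lap, x, t)
```

Under these assumptions, a small `t` keeps a channel close to its input (local propagation), while a large `t` drives it toward the graph-wide mean (global propagation). This is one way to read "control over the diffusion step". Both schemes preserve the per-channel sum of the features, because the constant vector lies in the null space of L.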