Geometric Deep Learning: the Erlangen Programme of ML
Moderator : Guido Montufar
For nearly two millennia, the word "geometry" was synonymous with Euclidean geometry, as no other types of geometry existed. Euclid's monopoly came to an end in the 19th century when multiple examples of non-Euclidean geometries were constructed. However, these studies quickly diverged into disparate fields, with mathematicians debating the relations between different geometries and what defines one. A way out of this pickle was shown by Felix Klein in his Erlangen Programme, which proposed approaching geometry as the study of invariants or symmetries using the language of group theory. In the 20th century, these ideas have been fundamental in developing modern physics, culminating in the Standard Model.
The current state of deep learning somewhat resembles the situation in the field of geometry in the 19h century: On the one hand, in the past decade deep learning has brought a revolution in data science and made possible many tasks previously thought to be beyond reach -- including computer vision, playing Go, or protein folding. At the same time, we have a zoo of neural network architectures for various kinds of data, but few unifying principles. As in times past, it is difficult to understand the relations between different methods, inevitably resulting in the reinvention and re-branding of the same concepts.
Geometric Deep Learning aims to bring geometric unification to deep learning in the spirit of the Erlangen Programme. Such an endeavour serves a dual purpose: it provides a common mathematical framework to study the most successful neural network architectures, such as CNNs, RNNs, GNNs, and Transformers, and gives a constructive procedure to incorporate prior knowledge into neural networks and build future architectures in a principled way. In this talk, I will overview the mathematical principles underlying Geometric Deep Learning on grids, graphs, and manifolds, and show some of the exciting and groundbreaking applications of these methods in a broad range of domains.
(based on joint work with J. Bruna, T. Cohen, and P. Veličković)