Over the past two decades, highthroughput data collection technologies have become commonplace in most fields of science and technology, and with them an everincreasing amount of big high dimensional data is being generated by virtually every realworld system. While such data systems are highly diverse in nature, the underlying data analysis and exploration tasks give rise to common challenges at the core of modern representation learning. For example, even though modern realworld data typically exhibit highdimensional ambient measurement spaces, they often exhibit lowdimensional intrinsic structures that can be uncovered by geometryoriented methods, such as the ones encountered in manifold learning, graph signal processing, geometric deep learning, and topological data analysis. As a result, recent years have seen significant interest and progress in geometric and topological approaches to representation learning, thus enabling tractable exploratory analysis by domain experts who frequently do not have a strong computational background.Motivation. Despite increased interest in the aforementioned methods, there is no forum in which to present work in progress to get the feedback of the machine learning community. Knowing the diverse backgrounds of researchers visiting ICLR, we consider this venue to be the perfect opportunity to bring together domain experts, practitioners, and researchers that are developing the nextgeneration computational methods. In our opinion, such discussions need to be held in an inclusive setting, getting feedback from different perspectives to improve the work and advance the state of the art. Our workshop provides a unique forum for disseminating (preliminary) research in fields that are not yet fully covered by the main conference. Our overarching goal is to deepen our understanding of challenges/opportunities, while breaking barriers between disjoint communities, emphasizing collaborative efforts in different domains.
Fri 5:00 a.m.  5:10 a.m.

Opening Remarks
(
Live
)

🔗 
Fri 5:10 a.m.  5:30 a.m.

Summary of Previous Workshops
(
Live
)

🔗 
Fri 5:30 a.m.  6:00 a.m.

Bernadette Stolz: Topological Data Analysis and Geometric Anomaly Detection ( Foundation Talk ) link »  🔗 
Fri 6:00 a.m.  6:20 a.m.

Roland Kwitt: Topologically Densified Distributions
(
Invited Talk
)
link »
In this talk, I am going to discuss some recent advances in the context of (topological) regularization for small samplesize learning with overparametrized neural networks. Specifically, I will shift focus from architectural properties, such as norms on the network weights, to properties of the internal representations before a linear classifier. In particular, I will advocate a topological constraint on samples drawn from the probability measure induced in that space. This provably leads to mass concentration effects around the representations of training instances, i.e., a property beneficial for generalization. Importantly, the topological constraints can be imposed in an efficient manner by leveraging results from prior work. A series of experiments on popular (vision) benchmarks provides strong empirical evidence to support the claim for better generalization in the small samplesize regime. 
🔗 
Fri 6:20 a.m.  6:30 a.m.

Break

🔗 
Fri 6:30 a.m.  7:10 a.m.

Panel D: Bridging Theory and Practice
(
Panel Discussion (live)
)

🔗 
Fri 7:10 a.m.  7:40 a.m.

Smita Krishnaswamy
(
Foundation Talk
)

🔗 
Fri 7:40 a.m.  8:00 a.m.

Stefanie Jegelka: Sign and Basis Invariant Networks for Spectral Graph Representation Learning ( Invited Talk ) link »  🔗 
Fri 8:00 a.m.  8:40 a.m.

Panel C: TopologyDriven Machine Learning
(
Panel Discussion (live)
)

🔗 
Fri 8:40 a.m.  8:45 a.m.

Neural Approximation of Extended Persistent Homology on Graphs
(
Spotlight
)
link »
SlidesLive Video » Persistent homology is a widely used theory in topological data analysis. In the context of graph learning, topological features based on persistent homology have been used to capture potentially highorder structural information so as to augment existing graph neural network methods. However, computing extended persistent homology summaries remains slow for large and dense graphs. Inspired by recent success in neural algorithmic reasoning, we propose a novel learning method to compute extended persistence diagrams on graphs. The proposed neural network aims to simulate a specific algorithm and learns to compute extended persistence diagrams for new graphs efficiently. Experiments on approximating extended persistence diagrams and several downstream graph representation learning tasks demonstrate the effectiveness of our method. Our method is also efficient; on large and dense graphs, we accelerate the computation by nearly 100 times. 
Zuoyu Yan · Tengfei Ma · Liangcai Gao · Zhi Tang · Yusu Wang · Chao Chen 🔗 
Fri 8:45 a.m.  8:50 a.m.

RipsNet: a general architecture for fast and robust estimation of the persistent homology of point clouds
(
Spotlight
)
link »
The use of topological descriptors in modern machine learning applications, such as Persistence Diagrams (PDs) arising from Topological Data Analysis (TDA), has shown great potential in various domains.However, their practical use in applications is often hindered by two major limitations: the computational complexity required to compute such descriptors exactly, and their sensitivity to even lowlevel proportions of outliers. In this work, we propose to bypass these two burdens in a datadriven setting by entrusting the estimation of (vectorization of) PDs built on top of point clouds to a neural network architecture that we call RipsNet. Once trained on a given data set, RipsNet can estimate topological descriptors on test data very efficiently with generalization capacity. Furthermore, we prove that RipsNet is robust to input perturbations in terms of the 1Wasserstein distance, a major improvement over the standard computation of PDs that only enjoys Hausdorff stability, yielding RipsNet to substantially outperform exactlycomputed PDs in noisy settings. We showcase the use of RipsNet on both synthetic and realworld data. Our implementation will be made freely and publicly available as part of an opensource library. 
Thibault de Surrel · Felix Hensel · Mathieu Carrière · Théo Lacombe · Yuichi Ike · Hiroaki Kurihara · Marc Glisse · Frederic Chazal 🔗 
Fri 8:50 a.m.  8:55 a.m.

Pretraining Molecular Graph Representation with 3D Geometry
(
Spotlight
)
link »
SlidesLive Video » Molecular graph representation learning is a fundamental problem in modern drug and material discovery. Molecular graphs are typically modeled by their 2D topological structures, but it has been recently discovered that 3D geometric information plays a more vital role in predicting molecular functionalities. However, the lack of 3D information in realworld scenarios has significantly impeded the learning of geometric graph representation. To cope with this challenge, we propose the Graph MultiView Pretraining (GraphMVP) framework where selfsupervised learning (SSL) is performed by leveraging the correspondence and consistency between 2D topological structures and 3D geometric views. GraphMVP effectively learns a 2D molecular graph encoder that is enhanced by richer and more discriminative 3D geometry. We further provide theoretical insights to justify the effectiveness of GraphMVP. Finally, comprehensive experiments show that GraphMVP can consistently outperform existing graph SSL methods. 
Shengchao Liu · Hanchen Wang · Weiyang Liu · Joan Lasenby · Hongyu Guo · Jian Tang 🔗 
Fri 8:55 a.m.  9:00 a.m.

Group Symmetry in PAC Learning
(
Spotlight
)
link »
SlidesLive Video » In this paper we show rigorously how learning in the PAC framework with invariant or equivariant hypotheses reduces to learning in a space of orbit representatives. Our results hold for any compact group, including infinite groups such as rotations. In addition, we show how to use these equivalences to derive generalisation bounds for invariant/equivariant models in terms of the geometry of the input and output spaces. To the best of our knowledge, our results are the most general of their kind to date. 
Bryn Elesedy 🔗 
Fri 9:00 a.m.  10:00 a.m.

Poster Session I ( Poster session on Gather.Town ) link »  🔗 
Fri 10:00 a.m.  10:25 a.m.

Chad Topaz: Topological Data Analysis of Collective Motion
(
Invited Talk
)
link »
Collective behaviors abound anywhere in nature that objects or agents interact. Investigators modeling collective behavior face a variety of challenges involving data from simulation and/or experiment. These challenges include exploring large, complex data sets to understand and characterize the system, inferring the model parameters that most accurately reflect a given data set, and assessing the goodnessoffit between experimental data sets and proposed models. Topological data analysis provides a lens through which these challenges may be addressed. I will highlight how topological techniques, sometimes in concert with machine learning, can be applied to models arising from the study of groups displaying collective motion, such as bird flocks, fish schools, and insect swarms. The key approach is to characterize a system’s dynamics via the timeevolution of topological invariants called Betti numbers, accounting for persistence of topological features across multiple scales. 
🔗 
Fri 10:25 a.m.  10:40 a.m.

Jonathan Godwin: Engineering Graph Nets ( Invited Talk ) link »  🔗 
Fri 10:40 a.m.  10:50 a.m.

Tara Chari ( Case Study ) link »  🔗 
Fri 10:50 a.m.  11:30 a.m.

Panel A: DataDriven Manifold Learning
(
Panel Discussion (live)
)

🔗 
Fri 11:30 a.m.  11:35 a.m.

TopTemp: Parsing Precipitate Structure from Temper Topology
(
Spotlight
)
link »
Technological advances are in part enabled by the development of novel manufacturing processes that give rise to new materials or material property improvements. Development and evaluation of new manufacturing methodologies is labor, time, and resourceintensive expensive due to complex, poorly defined relationships between advanced manufacturing process parameters and the resulting microstructures. In this work, we present a topological representation of temper (heattreatment) dependent material microstructure, as captured by scanning electron microscopy, called TopTemp. We show that this topological representation is able to support temper classification of microstructures in a data limited setting, generalizes well to previously unseen samples, is robust to image perturbations, and captures domain interpretable features. The presented work outperforms conventional deep learning baselines and is a first step towards improving understanding of process parameters and resulting material properties. 
Tegan Emerson · Lara Kassab · Scott Howland · Henry Kvinge · Keerti Kappagantula 🔗 
Fri 11:35 a.m.  11:40 a.m.

A Piecewise Polynomial Filtering Approach for Graph Neural Networks
(
Spotlight
)
link »
SlidesLive Video » Graph Neural Networks (GNNs) exploit signals from node features and the input graph topology to improve node classification task performance. Recently proposed GNNs work across a variety of homophilic and heterophilic graphs. Among these, models relying on polynomial graph filters have shown promise. We observe that polynomial filter models, in several practical instances, need to learn a reasonably high degree polynomials without facing any oversmoothing effects. We find that existing methods, due to their designs, either have limited efficacy or can be enhanced further. We present a spectral method to learn a bank of filters using a piecewise polynomial approach, where each filter acts on a different subsets of the eigen spectrum. The approach requires eigendecomposition for a few eigenvalues at extremes (i.e., low and high ends of the spectrum) and offers flexibility to learn sharper and complex shaped frequency responses with lowdegree polynomials. We theoretically and empirically show that our proposed model learns a better filter, thereby improving classification accuracy. Our model achieves performance gains of up to ~6% over the stateoftheart (SOTA) models. 
Vijay Lingam · Chanakya Ekbote · Manan Sharma · Rahul Ragesh · Arun Iyer · SUNDARARAJAN SELLAMANICKAM 🔗 
Fri 11:40 a.m.  11:45 a.m.

Message passing all the way up
(
Spotlight
)
link »
SlidesLive Video » The message passing framework is the foundation of the immense success enjoyed by graph neural networks (GNNs) in recent years. In spite of its elegance, there exist many problems it provably cannot solve over given input graphs. This has led to a surge of research on going beyond message passing, building GNNs which do not suffer from those limitationsa term which has become ubiquitous in regular discourse. However, have those methods truly moved beyond message passing? In this position paper, I argue about the dangers of using this termespecially when teaching graph representation learning to newcomers. I show that any function of interest we want to compute over graphs can, in all likelihood, be expressed using pairwise message passing  just over a potentially modified graph, and argue how most practical implementations subtly do this kind of trick anyway. Hoping to initiate a productive discussion, I propose replacing beyond message passing with a more tame term, augmented message passing. 
Petar Veličković 🔗 
Fri 11:45 a.m.  11:50 a.m.

Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs
(
Spotlight
)
link »
Cellular sheaves equip graphs with ``geometrical'' structure by assigning vector spaces and linear maps to nodes and edges. Graph Neural Networks (GNNs) implicitly assume a graph with a trivial underlying sheaf. This choice is reflected in the structure of the graph Laplacian operator, the properties of the associated diffusion equation, and the characteristics of the convolutional models that discretise this equation. In this paper, we use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs in heterophilic settings and their oversmoothing behaviour. By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands. At the same time, we prove that when the sheaf is nontrivial, discretised parametric diffusion processes have greater control than GNNs over their asymptotic behaviour. On the practical side, we study how sheaves can be learned from data. The resulting sheaf diffusion models have many desirable properties that address the limitations of classical graph diffusion equations (and corresponding GNN models) and obtain stateoftheart results in heterophilic settings. Overall, our work provides new connections between GNNs and algebraic topology and would be of interest to both fields. 
Cristian Bodnar · Francesco Di Giovanni · Benjamin Chamberlain · Pietro Lio · Michael Bronstein 🔗 
Fri 11:50 a.m.  12:50 p.m.

Poster Session II ( Poster session on Gather.Town ) link »  🔗 
Fri 12:50 p.m.  1:00 p.m.

Dmitry Kobak: What are 2D neighbour embeddings of scRNAseq data actually useful for? ( Case Study ) link »  🔗 
Fri 1:00 p.m.  1:50 p.m.

Panel B: LongRange Graph Representation Learning
(
Panel Discussion (live)
)

🔗 
Fri 1:40 p.m.  1:50 p.m.

Jessica Moore: G2 stem cells orchestrate timedirected, longrange coordination of calcium signaling during skin epidermal regeneration ( Case Study ) link »  🔗 
Fri 1:50 p.m.  2:00 p.m.

Closing Remarks
(
Live
)

🔗 


EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction
(
Poster
)
link »
Predicting how a druglike molecule binds to a specific protein target is a core problem in drug discovery. An extremely fast computational binding method would enable key applications such as fast virtual screening or drug engineering. Existing methods are computationally expensive as they rely on heavy candidate sampling coupled with scoring, ranking, and finetuning steps. We challenge this paradigm with EquiBind, an SE(3)equivariant geometric deep learning model performing directshot prediction of both i) the receptor binding location (blind docking) and ii) the ligand's bound pose and orientation. EquiBind achieves significant speedups and better quality compared to traditional and recent baselines. Further, we show extra improvements when coupling it with existing finetuning techniques at the cost of increased running time. Finally, we propose a novel and fast finetuning model that adjusts torsion angles of a ligand's rotatable bonds based on closedform global minima of the von Mises angular distance to a given input atomic point cloud, avoiding previous expensive differential evolution strategies for energy minimization. 
Hannes Stärk · Octavian Ganea · Lagnajit Pattanaik · Regina Barzilay · Tommi Jaakkola 🔗 


Efficient Representation Learning of Subgraphs by SubgraphToNode Translation
(
Poster
)
link »
A subgraph is a data structure that can represent various realworld problems. We propose SubgraphToNode (S2N) translation, which is a novel formulation to efficiently learn representations of subgraphs. Specifically, given a set of subgraphs in the global graph, we construct a new graph by coarsely transforming subgraphs into nodes. We perform subgraphlevel tasks as nodelevel tasks through this translation. By doing so, we can significantly reduce the memory and computational costs in both training and inference. We conduct experiments on four realworld datasets to evaluate performance and efficiency. Our experiments demonstrate that models with S2N translation are more efficient than stateoftheart models without substantial performance decrease. 
Dongkwan Kim · Alice Oh 🔗 


On the Inadequacy of CKA as a Measure of Similarity in Deep Learning
(
Poster
)
link »
Comparing learned representations is a challenging problem which has been approached in different ways. The CKA similarity metric, particularly it's linear variant, has recently become a popular approach and has been widely used to compare representations of a network's different layers, of similar networks trained differently, or of models with different architectures trained on the same data. CKA results have been used to make a wide variety of claims about similarity and dissimilarity of these various representations. In this work we investigate several weaknesses of the CKA similarity metric, demonstrating situations in which it gives unexpected or counterintuitive results. We then study approaches for modifying representations to maintain functional behaviour while changing the CKA value. Indeed we illustrate in some cases the CKA value can be heavily manipulated without substantial changes to the functional behaviour. 
MohammadReza Davari · Stefan Horoi · Amine Natik · Guillaume Lajoie · Guy Wolf · Eugene Belilovsky 🔗 


WEISFEILER AND LEMAN GO INFINITE: SPECTRAL AND COMBINATORIAL PRECOLORINGS
(
Poster
)
link »
Two popular alternatives for graph isomorphism testing that offer a good tradeoff between expressive power and computational efficiency are combinatorial (i.e., obtained via the WeisfeilerLeman (WL) test) and spectral invariants. While the exact power of the latter is still an open question, the former is regularly criticized for its limited power, when a standard configuration of uniform precoloring is used. This drawback hinders the applicability of Message Passing Graph Neural Networks (MPGNNs), whose expressive power is upper bounded by the WL test. Relaxing the assumption of uniform precoloring, we show that one can increase the expressive power of the WL test ad infinitum. Following that, we propose an efficient precoloring based on spectral features that provably increase the expressive power of the vanilla WL test.The code to reproduce our experiments is available at \url{https://github.com/TPFI22/SpectralandCombinatorial} 
Or Feldman · Amit Boyarski · Shai Feldman · Dani Kogan · Avi Mendelson · Chaim Baskin 🔗 


ON RECOVERABILITY OF GRAPH NEURAL NETWORK REPRESENTATIONS
(
Poster
)
link »
Despite their growing popularity, graph neural networks (GNNs) still have multiple unsolved problems, including finding more expressive aggregation methods, propagation of information to distant nodes, and training on largescale graphs.Understanding and solving such problems require developing analytic tools and techniques.In this work, we propose the notion of \textit{recoverability}, which is tightly related to information aggregation in GNNs,and based on this concept, develop the method for GNN embedding analysis.Through extensive experimental results on various datasets and different GNN architectures, we demonstrate that estimated recoverability correlates with aggregation method expressivity and graph sparsification quality.The code to reproduce our experiments is available at \url{https://github.com/Anonymous1252022/Recoverability}. 
Maxim Fishman · Chaim Baskin · Evgenii Zheltonozhskii · Ron Banner · Avi Mendelson 🔗 


SpeqNets: Sparsityaware Permutationequivariant Graph Networks
(
Poster
)
link »
While graph neural networks have clear limitations in approximating permutationequivariant functions over graphs, more expressive, higherorder graph neural networks do not scale to large graphs. By introducing new heuristics for the graph isomorphism problem, we devise a class of universal, permutationequivariant graph networks, which offers a finegrained control between expressivity and scalability and adapt to the sparsity of the graph. These architectures lead to vastly reduced computation times compared to standard higherorder graph networks while significantly improving over standard graph neural network and graph kernel architectures in terms of predictive performance. 
Christopher Morris · Gaurav Rattan · Sandra Kiefer · Siamak Ravanbakhsh 🔗 


Denoising Diffusion Probabilistic Models on SO(3) for Rotational Alignment
(
Poster
)
link »
Probabilistic diffusion models are capable of modeling complex data distributions on highdimensional Euclidean spaces for a range applications. However, many real world tasks involve more complex structures such as data distributions defined on manifolds which cannot be easily represented by diffusions on $\mathbb{R}^n$. This paper proposes denoising diffusion models for tasks involving 3D rotations leveraging diffusion processes on the Lie group $SO(3)$ in order to generate candidate solutions to rotational alignment tasks. The experimental results show the proposed $SO(3)$ diffusion process outperforms naïve approaches such as Euler angle diffusion in synthetic rotational distribution sampling and in a 3D object alignment task.

Adam Leach · Sebastian Schmon · Matteo Degiacomi · Chris G Willcocks 🔗 


DiffusionBased Methods for Estimating Curvature in Data
(
Poster
)
link »
Highthroughput highdimensional data is now being generated in massive quantities in many fields including biology, medicine, chemistry, finance, and physics. Researchers have successfully used manifold learning in order to gain insight from such data, particularly in biomedical and singlecell data. One such technique, data diffusion geometry, has been useful in understanding manifold intrinsic distances, density, and major nonlinear axes or paths through the data. However, a relatively unstudied feature of highdimensional data is curvature. While curvature is welldefined and easy to compute in low dimensions, it poses computational and conceptual difficulties in high dimensions. Here, we present two techniques to estimate curvature from highdimensional data starting from data diffusion probabilities. The first technique, diffusion curvature, uses the spread or conversely laziness of a random walk to estimate curvature pointwise in data. The second technique, deep diffusion curvature, trains a neural network to estimate pointwise curvature. Since these techniques are scalable, we anticipate that they can be used to describe and compare datasets as well as find points in data that represent transitional entities. 
Dhananjay Bhaskar · Kincaid MacDonald · Dawson Thomas · Sarah Zhao · Kisung You · Jennifer Paige · Yariv Aizenbud · Bastian Rieck · Ian Adelstein · Smita Krishnaswamy 🔗 


HIGH SKIP NETWORKS: A HIGHER ORDER GENERALIZATION OF SKIP CONNECTIONS
(
Poster
)
link »
We present High Skip Networks (HSNs), a higher order generalization of skip connection neural networks to simplicial complexes. HSNs exploit higher order structure encoded in a simplicial domain by creating multiple feedforward paths of signals computed over the input complex. Some feedforward paths may propagate the signal through various higher order structures; e.g., if we want to propagate signals over edges, some feedforward paths may go from edges to triangles and then back to edges. Similar to the Euclidean skip connection networks, all paths are combined together at the end by addition or concatenation. We demonstrate the effectiveness of HSNs on synthetic and real datasets. Our preliminary results show that HSNs lead to a statistically significant improvement in the generalization error when compared to base models without high skip components. 
Mustafa Hajij · Karthikeyan Natesan Ramamurthy · Aldo GuzmánSáenz · Ghada Za 🔗 


Message Passing Neural Processes
(
Poster
)
link »
Neural Processes (NPs) are powerful and flexible models able to incorporate uncertainty when representing stochastic processes, while maintaining a linear time complexity. However, NPs produce a latent description by aggregating independent representations of context points and lack the ability to exploit relational information present in many datasets. This renders NPs ineffective in settings where the stochastic process is primarily governed by neighbourhood rules, such as cellular automata (CA), and limits performance for any task where relational information remains unused. We address this shortcoming by introducing Message Passing Neural Processes (MPNPs), the first class of NPs that explicitly makes use of relational structure within the model. Our evaluation shows that MPNPs thrive at lower sampling rates, on existing benchmarks and newlyproposed CA and CoraBranched tasks. We further report strong generalisation over densitybased CA rulesets and significant gains in challenging arbitrarylabelling and fewshot learning setups. 
Catalina Cangea · Ben Day · Arian Jamasb · Pietro Lio 🔗 


Graph Anisotropic Diffusion
(
Poster
)
link »
Traditional Graph Neural Networks (GNNs) rely on message passing, which amounts to permutationinvariant local aggregation of neighbour features. Such a process is isotropic and there is no notion of `direction' on the graph. We present a new GNN architecture called Graph Anisotropic Diffusion. Our model alternates between linear diffusion, for which a closedform solution is available, and local anisotropic filters to obtain efficient multihop anisotropic kernels. We test our model on two common molecular property prediction benchmarks (ZINC and QM9) and show its competitive performance. 
Ahmed Elhag · Gabriele Corso · Hannes Stärk · Michael Bronstein 🔗 


Graph Neural Networks are Dynamic Programmers
(
Poster
)
link »
Recent advances in neural algorithmic reasoning with graph neural networks (GNNs) are propped up by the notion of algorithmic alignment. Broadly, a neural network will be better at learning to execute a reasoning task (in terms of sample complexity) if its individual components align well with the target algorithm. Specifically, GNNs are claimed to align with dynamic programming (DP), a general problemsolving strategy which expresses many polynomialtime algorithms. However, has this alignment truly been demonstrated and theoretically quantified? Here we show, using methods from category theory and abstract algebra, that there exists an intricate connection between GNNs and DP, going well beyond the initial observations over individual algorithms such as BellmanFord. Exposing this connection, we easily verify several prior findings in the literature, and hope it will serve as a foundation for building stronger algorithmically aligned GNNs. 
Andrew Dudzik · Petar Veličković 🔗 


Cycle Representation Learning for Inductive Relation Prediction
(
Poster
)
link »
Inductive relation prediction is an important learning task for knowledge graph completion. To predict the relation between two entities, one can use the existence of rules, namely a sequence of relations. Previous works primarily focus on searching the rules between entities. The space of rules is huge, and one has to sacrifice either efficiency or accuracy. In this paper, we consider rules as cycles and show that the space of cycles has a unique structure based on the mathematics of algebraic topology. By exploring the linear structure of the cycle space, we can improve the searching efficiency of rules. We propose to collect cycle bases that span the space of cycles. We build a novel GNN framework on the collected cycles to learn the representations of cycles, and to predict the existence/nonexistence of a relation. Our method achieves stateoftheart performance on popular benchmarks. 
Zuoyu Yan · Tengfei Ma · Liangcai Gao · Zhi Tang · Chao Chen 🔗 


Riemannian Neural SDE: Learning Stochastic Representations on Manifolds
(
Poster
)
link »
In recent years, the neural stochastic differential equation (NSDE) has gained attention in modeling stochastic representations, while NSDE brings a great success in various types of applications. However, it typically loses the expressivity when the data representation is manifoldvalued. To overcome such an issue, we suggest a principled way to express the stochastic representation with the Riemannian neural SDE (RNSDE), which extends the conventional Euclidean NSDE. Empirical results on the density estimation on manifolds show that the proposed method significantly outperforms baseline methods. 
Sung Woo Park · Hyomin Kim · Hyeseong Kim · Junseok Kwon 🔗 


Simplicial Attention Networks
(
Poster
)
link »
Graph representation learning methods have mostly been limited to the modelling of nodewise interactions. Recently, there has been an increased interest in understanding how higherorder structures can be utilised to further enhance the learning abilities of graph neural networks (GNNs) in combinatorial spaces. Simplicial Neural Networks (SNNs) naturally model these interactions by performing message passing on simplicial complexes, higherdimensional generalisations of graphs. Nonetheless, the computations performed by most existent SNNs are strictly tied to the combinatorial structure of the complex. Leveraging the success of attention mechanisms in structured domains, we propose Simplicial Attention Networks (SAT), a new type of simplicial network that dynamically weighs the interactions between neighbouring simplicies and can readily adapt to novel structures. Additionally, we propose a signed attention mechanism that makes SAT orientation equivariant, a desirable property for models operating on (co)chain complexes. We demonstrate that SAT outperforms existent convolutional SNNs and GNNs in two image and trajectory classification tasks. 
Christopher Goh · Cristian Bodnar · Pietro Lio 🔗 


Gauge Equivariant Deep QLearning on Discrete Manifolds
(
Poster
)
link »
Data, at any point on a manifold, can be represented on the tangent plane at that point with respect to a basis, called a gauge. But the choice of gauge is not unique for arbitrary manifolds. Hence, for agents traversing an environment embedded on a manifold, the same environment may appear differently if the choice of gauge changes or when moving to a different point that has a different gauge. This may be deleterious to an agent's learning, as compared to learning on, say, a flat grid where it is easy to choose a fixed gauge for each point. To this end, we provide a formulation of deep Qlearning that learns policies (and Qvalues) that are equivariant (invariant) to changes in choice of gauge. This leads to an efficient learning algorithm independent of the choice of gauge. Our experimental results demonstrate significant improvement in learning on novel environments embedded in arbitrary manifolds such as spheres, hills, and urns, compared to naive approaches. 
Sourya Basu · Pulkit Katdare · Katherine DriggsCampbell · Lav R Varshney 🔗 


Manifoldaligned Neighbor Embedding
(
Poster
)
link »
In this paper, we introduce a neighbor embedding framework for manifold alignment. We demonstrate the efficacy of the framework using a manifoldaligned version of the uniform manifold approximation and projection algorithm. We show that our algorithm can learn an aligned manifold that is visually competitive to embedding of the whole dataset. 
Mohammad Tariqul Islam · Jason Fleischer 🔗 


Decoupled Graph Neural Networks based on Label Agreement Message Propagation
(
Poster
)
link »
Decoupling has become a new paradigm in Graph Neural Networks (GNNs) for its effectiveness and scalability. However, this paradigm still faces two several restrictions: unsatisfying propagation, caused by noisy or confused edges, could greatly degrade model performance; fixed aggregation schema with the same propagation steps and the same combination weights for each node limit achieving optimal performance. To address these problems, we propose a novel decoupled graph model named LADGNN based on label agreement message propagation and combine the intermediate feature after each propagation step as input. In our method, we decouple the graph model which trains a base predictor based on multilayer perceptrons with a prestep to propagate features and a poststep to propagate labels. We utilize an auxiliary label agreement model to generate proper edge weights to promote reliable propagation. When training the base predictor, we concatenate all intermediate features after each propagation step to make the model dynamically learn information of neighbors at different distances. Extensive experiments on five realworld datasets demonstrate that our method achieves superior performance over all baseline methods in terms of node classification accuracy. 
Zhicheng An · Zhengwei Wu · Binbin Hu · Zhiqiang Zhang · JUN ZHOU · Yue Wang · ShaoLun Huang 🔗 


Reducing Learning on Cell Complexes to Graphs
(
Poster
)
link »
Message passing graph neural networks (GNNs) are known to have a limited expressiveness in distinguishing graphs. A recent approach tackles this problem by transforming graphs to regular cell complexes. This makes it possible to model higherorder structures and yields algorithms that are more powerful than the Weisfeiler Leman test (WL) or GNNs. However, this approach cannot easily be combined with previous graph algorithms and implementations due to their fundamental differences. We develop Cell Encoding, a novel approach of transforming regular cell complexes to graphs. We show that cell encoding combined with WL or a suitably expressive GNN is at least as expressive as Cellular Weisfeiler Leman (CWL) in distinguishing cell complexes. This means that with a simple preprocessing one can use any GNN for learning tasks on cell complexes. Additionally, we show that this approach can make GNNs more expressive and give better results on graph classification datasets. 
Fabian Jogl · Maximilian Thiessen · Thomas Gärtner 🔗 


Multiscale Physical Representations for Approximating PDE Solutions with Graph Neural Operators
(
Poster
)
link »
Representing physical signals at different scales is among the most challenging problems in engineering. Several multiscale modeling tools have been developed to describe physical systems governed by Partial Differential Equations (PDEs). These tools are at the crossroad of principled physical models and numerical schema. Recently, datadriven models have been introduced to speedup the approximation of PDE solutions compared to numerical solvers. Among these recent datadriven methods, neural integral operators are a class that learn a mapping between function spaces. These functions are discretized on graphs (meshes) which are appropriate for modeling interactions in physical phenomena. In this work, we study three multiresolution schema with integral kernel operators that can be approximated with Message Passing Graph Neural Networks (MPGNNs). To validate our study, we make extensive MPGNNs experiments with wellchosen metrics considering steady and unsteady PDEs. Code: https://github.com/LeonMigu/multiscalegraphneuraloperator 
Léon Migus · Yuan Yin · Ahmed Mazari · patrick gallinari 🔗 


Learning Weighted Product Spaces Representations for Graphs of Heterogeneous Structures
(
Poster
)
link »
In representation learning, it is important to learn embedding spaces whose geometry matches the underlying structure of the data. In the literature, an active research direction aims at using product spaces, which consists of Euclidean and nonEuclidean manifolds to represent data of varying curvatures. However, realworld data is usually heterogeneous and consists of a mixture of varying structures, requiring the representation learning process to flexibly select and combine the member spaces accordingly. Since previous works only consider combination of embedding spaces with equal weights, in this paper, we propose a datadriven method to learn the embeddings in a weighted product spaces for graph data. Specifically, our model utilizes the topological information of input graph to learn the weight for each component of the product spaces. Experiments on synthetic and realworld datasets show that our models produce better representations in terms of distortion measures, and perform better on tasks such as word similarity learning. 
Tuc Nguyen · Dung Le · Anh Ta 🔗 


On subsampling and inference for multiparameter persistence homology
(
Poster
)
link »
In topological data analysis, multiparameter persistence homology is a framework for extracting topological information for point cloud datasets equipped with multiple filtrations. However, existing algorithms become computationally expensive for large datasets, limiting their applicability. In the singleparameter case, subsampling algorithms have been used to reduce the time complexity of computing persistence homology. Convergence properties of the persistence barcodes have also been established in this setting. We extend these results to the multiparameter persistence homology, and develop subsampling algorithms can be used to approximate the fibered barcode in this setting. We conduct experiments on the point cloud dataset ModelNet to demonstrate the efficiency of these algorithms. 
Vinoth Nandakumar 🔗 


Sign and Basis Invariant Networks for Spectral Graph Representation Learning
(
Poster
)
link »
Many machine learning tasks involve processing eigenvectors derived from data. Especially valuable are Laplacian eigenvectors, which capture useful structural information about graphs and other geometric objects. However, ambiguities arise when computing eigenvectors: for each eigenvector v, the sign flipped v is also an eigenvector. More generally, higher dimensional eigenspaces contain infinitely many choices of eigenvector bases. In this work we introduce SignNet and BasisNet  new neural architectures that are invariant to all requisite symmetries and hence process collections of eigenspaces in a principled manner. Our networks are universal, i.e., they can approximate any continuous function of eigenvectors with the proper invariances. They are also theoretically strong for graph representation learning  they can provably approximate any spectral graph convolution, spectral invariants that go beyond message passing neural networks, and other graph positional encodings. Experiments show the strength of our networks for learning spectral graph filters and learning graph positional encodings. 
Derek Lim · Joshua Robinson · Lingxiao Zhao · Tess Smidt · Suvrit Sra · Haggai Maron · Stefanie Jegelka 🔗 


CubeRep: Learning Relations Between Different Views of Data
(
Poster
)
link »
Multiview learning tasks typically seek an aggregate synthesis of multiple views or perspectives of a single data set. The current approach assumes that there is an ambient space $X$ in which the views are images of $X$ under certain functions and attempts to learn these functions via a neural network. Unfortunately, such an approach neglects to consider the geometry of the ambient space. Hierarchically hyperbolic spaces (HHSes) do, however, provide a natural multiview arrangement of data; they provide geometric tools for the assembly of different views of a single data set into a coherent global space, a \emph{CAT(0) cube complex}. In this work, we provide the first step toward theoretically justifiable methods for learning embeddings of multiview data sets into CAT(0) cube complexes. We present an algorithm which, given a finite set of finite metric spaces (views) on a finite set of points (the objects), produces the key components of an HHS structure. From this structure, we can produce a \emph{CAT(0) cube complex} that encodes the hyperbolic geometry in the data while simultaneously allowing for Euclidean features given by the detected relations among the views.

Rishi Sonthalia · Anna Gilbert · Matthew Durham 🔗 


Twodimensional visualization of large document libraries using tSNE
(
Poster
)
link »
We benchmarked different approaches for creating 2D visualizations of large document libraries, using the MEDLINE (PubMed) database of the entire biomedical literature as a use case (19 million scientific papers). Our optimal pipeline is based on logscaled TFIDF representation of the abstract text, SVD preprocessing, and tSNE with uniform affinities, early exaggeration annealing, and extended optimization. The resulting embedding distorts local neighborhoods but shows meaningful organization and rich structure on the level of narrow academic fields. 
Rita GonzálezMárquez · Philipp Berens · Dmitry Kobak 🔗 


REMuSGNN: A RotationEquivariant Model for Simulating Continuum Dynamics
(
Poster
)
link »
Numerical simulation is an essential tool in many areas of science and engineering, but its performance often limits application in practice, or when used to explore large parameter spaces. On the other hand, surrogate deep learning models, while accelerating simulations, often exhibit poor accuracy and ability to generalise. In order to improve these two factors, we introduce REMuSGNN, a rotationequivariant multiscale model for simulating continuum dynamical systems encompassing a range of length scales. REMuSGNN is designed to predict an output vector field from an input vector field on a physical domain discredited into an unstructured set of nodes. Equivariance to rotations of the domain is a desirable inductive bias that allows the network to learn the underlying physics more efficiently, leading to improved accuracy and generalisation compared with similar architectures that lack such symmetry. We demonstrate and evaluate this method on the incompressible flow around elliptical cylinders. 
Mario Lino Valencia · Stathi Fotiadis · Anil Bharath · Chris Cantwell 🔗 


Random Filters for Enriching the Discriminatory Power of Topological Representations
(
Poster
)
link »
Topological representations of data are inherently coarse summaries which endows them with certain desirable properties like stability but also potentially inhibits their discriminatory power relative to finescale learned features. In this work we present a novel framework for enriching the discriminatory power of topological representations based on random filters and capturing “interference topology” rather than direct topology. We show that our random filters outperform previously explored structured image filters while requiring orders of magnitudeless computational time. The approach is demonstrated on the MNIST dataset but is broadly applicable across data sets and modalities. This work is concluded with a discussion of the mathematical intuition underlying the approach and identification of future directions to enable deeper understanding and theoretical results. 
Tegan Emerson · Grayson Jorgenson · Henry Kvinge · Colin Olson 🔗 


Sparsifying the Update Step in Graph Neural Networks
(
Poster
)
link »
MessagePassing Neural Networks (MPNNs), the most prominent Graph Neural Network (GNN) framework, celebrate much success in the analysis of graphstructured data. Concurrently, the sparsification of Neural Network models attracts a great amount of academic and industrial interest. In this paper we conduct a structured, empirical study of the effect of sparsification on the trainable part of MPNNs known as the Update step. To this end, we design a series of models to successively sparsify the linear transform in the Update step. Specifically, we propose the ExpanderGNN model with a tuneable sparsification rate and the ActivationOnly GNN, which has no linear transform in the Update step. In agreement with a growing trend in the literature the sparsification paradigm is changed by initialising sparse neural network architectures rather than expensively sparsifying already trained architectures. Our novel benchmark models enable a better understanding of the influence of the Update step on model performance and outperform existing simplified benchmark models such as the Simple Graph Convolution. The ExpanderGNNs, and in some cases the ActivationOnly models, achieve performance on par with their vanilla counterparts on several downstream tasks, while containing significantly fewer trainable parameters. Our code is publicly available at: https://github.com/ChangminWu/ExpanderGNN. 
Johannes Lutzeyer · Changmin Wu · Michalis Vazirgiannis 🔗 


An extensible Benchmarking GraphMesh dataset for studying SteadyState Incompressible NavierStokes Equations
(
Poster
)
link »
Recent progress in Geometric Deep Learning (GDL) has shown its potential to provide powerful datadriven models. This gives momentum to explore new methods for learning physical systems governed by Partial Differential Equations (PDEs) from GraphMesh data. However, despite the efforts and recent achievements, several research directions remain unexplored and progress is still far from satisfying the physical requirements of realworld phenomena. One of the major impediments is the absence of benchmarking datasets and common physics evaluation protocols. In this paper, we propose a 2D graphmesh dataset to study the airflow over airfoils at high Reynolds regime (from $10^6$ and beyond). We also introduce metrics on the stress forces over the airfoil in order to evaluate GDL models on important physical quantities. Moreover, we provide extensive GDL baselines. Code: https://github.com/Extrality/ICLR_NACA_Dataset_V0 Dataset: https://data.isir.upmc.fr/extrality/

Florent Bonnet · Ahmed Mazari · Thibaut Munzer · Pierre Yser · patrick gallinari 🔗 


Diversified Multiscale Graph Learning with Graph SelfCorrection
(
Poster
)
link »
Though the multiscale graph learning techniques have enabled advanced feature extraction frameworks, we find that the classic ensemble strategy shows inferior performance while encountering the high homogeneity of the learnt representation, which is caused by the nature of existing graph pooling methods. To cope with this issue, we propose a diversified multiscale graph learning model equipped with two core ingredients: a graph selfcorrection mechanism to generate informative embedded graphs, and a diversity boosting regularizer to achieve a comprehensive characterization of the input graph. The proposed mechanism compensates the pooled graph with the lost information during the graph pooling process by feeding back the estimated residual graph, which serves as a plugin component for popular graph pooling methods. Meanwhile, pooling methods enhanced with the selfcorrecting procedure encourage the discrepancy of node embeddings, and thus it contributes to the success of ensemble learning strategy. The proposed regularizer instead enhances the ensemble diversity at the graphlevel embeddings by leveraging the interaction among individual classifiers. Extensive experiments on popular graph classification benchmarks show that the approaches lead to significant improvements over stateoftheart graph pooling methods, and the ensemble multiscale graph learning models achieve superior enhancement. 
Yuzhao Chen · Yatao Bian · Jiying Zhang · Xi Xiao · Tingyang Xu · Yu Rong 🔗 


Heterogeneous manifolds for curvatureaware graph embedding
(
Poster
)
link »
The quality of graph embeddings depends on whether the geometry of the space matches that of the graph. Euclidean spaces are often a poor choice and recently hyperbolic spaces and more general manifolds, such as products of constantcurvature spaces and matrix manifolds, have resulted advantageous to better matching nodes pairwise distances. However, all these manifolds are homogeneous, implying that the curvature distribution is the same at each point, making them unsuited to match the local curvature (and related structural properties) of the graph. We study embeddings in a broader class of heterogeneous rotationallysymmetric manifolds. By adding a single radial dimension to existing homogeneous models, we can both account for heterogeneous curvature distributions on graphs and pairwise distances. We evaluate our approach on reconstruction tasks. 
Francesco Di Giovanni · Giulia Luise · Michael Bronstein 🔗 


Fiber Bundle Morphisms as a Framework for Modeling ManytoMany Maps
(
Poster
)
link »
While it is not generally reflected in the `nice' datasets used for benchmarking machine learning algorithms, the realworld is full of processes that would be best described as manytomany. That is, a single input can potentially yield many different outputs (whether due to noise, imperfect measurement, or intrinsic stochasticity in the process) and many different inputs can yield the same output (that is, the map is not injective). For example, imagine a sentiment analysis task where, due to linguistic ambiguity, a single statement can have a range of different sentiment interpretations while at the same time many distinct statements can represent the same sentiment. When modeling such a multivalued function $f: X \rightarrow Y$, it is frequently useful to be able to model the distribution on $f(x)$ for specific input $x$ as well as the distribution on fiber $f^{1}(y)$ for specific output $y$. Such an analysis helps the user (i) better understand the variance intrinsic to the process they are studying and (ii) understand the range of specific input $x$ that can be used to achieve output $y$. Following existing work which used a fiber bundle framework to better model manytoone processes, we describe how morphisms of fiber bundles provide a template for building models which naturally capture the structure of manytomany processes.

Elizabeth Coda · Nico Courts · Colby Wight · Loc Truong · WoongJo Choi · Charles Godfrey · Tegan Emerson · Keerti Kappagantula · Henry Kvinge 🔗 


Persistent Toralgebra based stacking ensemble learning (PTASEL) for proteinprotein binding affinity prediction
(
Poster
)
link »
Proteinprotein interactions (PPIs) play crucial roles in almost all biological processes. Recently, Datadriven machine learning models have shown great power in the analysis of PPIs. However, efficient molecular representation and featurization are still key issues that hinder the performance of learning models. Here, we propose persistent Toralgebra (PTA), PTAbased molecular characterization and featurization, and PTAbased stacking ensemble learning (PTASEL) for PPI binding affinity prediction, for the first time. More specifically, the VietorisRipscomplex is used to characterize the PPI structure and its persistent Toralgebra is computed to form the molecular descriptors. These descriptors then are fed into our stacking model to make the prediction. We systematically test our model on the two most commonly used datasets, i.e., SKEMPI and ABBind. It has been found that our model outperforms all the existing models as far as we know, which demonstrates the great power of our model. 
Xiang LIU · KELIN XIA 🔗 


Neural Approximation of Extended Persistent Homology on Graphs
(
Poster
)
link »
Persistent homology is a widely used theory in topological data analysis. In the context of graph learning, topological features based on persistent homology have been used to capture potentially highorder structural information so as to augment existing graph neural network methods. However, computing extended persistent homology summaries remains slow for large and dense graphs. Inspired by recent success in neural algorithmic reasoning, we propose a novel learning method to compute extended persistence diagrams on graphs. The proposed neural network aims to simulate a specific algorithm and learns to compute extended persistence diagrams for new graphs efficiently. Experiments on approximating extended persistence diagrams and several downstream graph representation learning tasks demonstrate the effectiveness of our method. Our method is also efficient; on large and dense graphs, we accelerate the computation by nearly 100 times. 
Zuoyu Yan · Tengfei Ma · Liangcai Gao · Zhi Tang · Yusu Wang · Chao Chen 🔗 


RipsNet: a general architecture for fast and robust estimation of the persistent homology of point clouds
(
Poster
)
link »
The use of topological descriptors in modern machine learning applications, such as Persistence Diagrams (PDs) arising from Topological Data Analysis (TDA), has shown great potential in various domains.However, their practical use in applications is often hindered by two major limitations: the computational complexity required to compute such descriptors exactly, and their sensitivity to even lowlevel proportions of outliers. In this work, we propose to bypass these two burdens in a datadriven setting by entrusting the estimation of (vectorization of) PDs built on top of point clouds to a neural network architecture that we call RipsNet. Once trained on a given data set, RipsNet can estimate topological descriptors on test data very efficiently with generalization capacity. Furthermore, we prove that RipsNet is robust to input perturbations in terms of the 1Wasserstein distance, a major improvement over the standard computation of PDs that only enjoys Hausdorff stability, yielding RipsNet to substantially outperform exactlycomputed PDs in noisy settings. We showcase the use of RipsNet on both synthetic and realworld data. Our implementation will be made freely and publicly available as part of an opensource library. 
Thibault de Surrel · Felix Hensel · Mathieu Carrière · Théo Lacombe · Yuichi Ike · Hiroaki Kurihara · Marc Glisse · Frederic Chazal 🔗 


Pretraining Molecular Graph Representation with 3D Geometry
(
Poster
)
link »
Molecular graph representation learning is a fundamental problem in modern drug and material discovery. Molecular graphs are typically modeled by their 2D topological structures, but it has been recently discovered that 3D geometric information plays a more vital role in predicting molecular functionalities. However, the lack of 3D information in realworld scenarios has significantly impeded the learning of geometric graph representation. To cope with this challenge, we propose the Graph MultiView Pretraining (GraphMVP) framework where selfsupervised learning (SSL) is performed by leveraging the correspondence and consistency between 2D topological structures and 3D geometric views. GraphMVP effectively learns a 2D molecular graph encoder that is enhanced by richer and more discriminative 3D geometry. We further provide theoretical insights to justify the effectiveness of GraphMVP. Finally, comprehensive experiments show that GraphMVP can consistently outperform existing graph SSL methods. 
Shengchao Liu · Hanchen Wang · Weiyang Liu · Joan Lasenby · Hongyu Guo · Jian Tang 🔗 


Group Symmetry in PAC Learning
(
Poster
)
link »
In this paper we show rigorously how learning in the PAC framework with invariant or equivariant hypotheses reduces to learning in a space of orbit representatives. Our results hold for any compact group, including infinite groups such as rotations. In addition, we show how to use these equivalences to derive generalisation bounds for invariant/equivariant models in terms of the geometry of the input and output spaces. To the best of our knowledge, our results are the most general of their kind to date. 
Bryn Elesedy 🔗 


TopTemp: Parsing Precipitate Structure from Temper Topology
(
Poster
)
link »
Technological advances are in part enabled by the development of novel manufacturing processes that give rise to new materials or material property improvements. Development and evaluation of new manufacturing methodologies is labor, time, and resourceintensive expensive due to complex, poorly defined relationships between advanced manufacturing process parameters and the resulting microstructures. In this work, we present a topological representation of temper (heattreatment) dependent material microstructure, as captured by scanning electron microscopy, called TopTemp. We show that this topological representation is able to support temper classification of microstructures in a data limited setting, generalizes well to previously unseen samples, is robust to image perturbations, and captures domain interpretable features. The presented work outperforms conventional deep learning baselines and is a first step towards improving understanding of process parameters and resulting material properties. 
Tegan Emerson · Lara Kassab · Scott Howland · Henry Kvinge · Keerti Kappagantula 🔗 


A Piecewise Polynomial Filtering Approach for Graph Neural Networks
(
Poster
)
link »
Graph Neural Networks (GNNs) exploit signals from node features and the input graph topology to improve node classification task performance. Recently proposed GNNs work across a variety of homophilic and heterophilic graphs. Among these, models relying on polynomial graph filters have shown promise. We observe that polynomial filter models, in several practical instances, need to learn a reasonably high degree polynomials without facing any oversmoothing effects. We find that existing methods, due to their designs, either have limited efficacy or can be enhanced further. We present a spectral method to learn a bank of filters using a piecewise polynomial approach, where each filter acts on a different subsets of the eigen spectrum. The approach requires eigendecomposition for a few eigenvalues at extremes (i.e., low and high ends of the spectrum) and offers flexibility to learn sharper and complex shaped frequency responses with lowdegree polynomials. We theoretically and empirically show that our proposed model learns a better filter, thereby improving classification accuracy. Our model achieves performance gains of up to ~6% over the stateoftheart (SOTA) models. 
Vijay Lingam · Chanakya Ekbote · Manan Sharma · Rahul Ragesh · Arun Iyer · SUNDARARAJAN SELLAMANICKAM 🔗 


Message passing all the way up
(
Poster
)
link »
The message passing framework is the foundation of the immense success enjoyed by graph neural networks (GNNs) in recent years. In spite of its elegance, there exist many problems it provably cannot solve over given input graphs. This has led to a surge of research on going beyond message passing, building GNNs which do not suffer from those limitationsa term which has become ubiquitous in regular discourse. However, have those methods truly moved beyond message passing? In this position paper, I argue about the dangers of using this termespecially when teaching graph representation learning to newcomers. I show that any function of interest we want to compute over graphs can, in all likelihood, be expressed using pairwise message passing  just over a potentially modified graph, and argue how most practical implementations subtly do this kind of trick anyway. Hoping to initiate a productive discussion, I propose replacing beyond message passing with a more tame term, augmented message passing. 
Petar Veličković 🔗 


Neural Sheaf Diffusion: A Topological Perspective on Heterophily and Oversmoothing in GNNs
(
Poster
)
link »
Cellular sheaves equip graphs with ``geometrical'' structure by assigning vector spaces and linear maps to nodes and edges. Graph Neural Networks (GNNs) implicitly assume a graph with a trivial underlying sheaf. This choice is reflected in the structure of the graph Laplacian operator, the properties of the associated diffusion equation, and the characteristics of the convolutional models that discretise this equation. In this paper, we use cellular sheaf theory to show that the underlying geometry of the graph is deeply linked with the performance of GNNs in heterophilic settings and their oversmoothing behaviour. By considering a hierarchy of increasingly general sheaves, we study how the ability of the sheaf diffusion process to achieve linear separation of the classes in the infinite time limit expands. At the same time, we prove that when the sheaf is nontrivial, discretised parametric diffusion processes have greater control than GNNs over their asymptotic behaviour. On the practical side, we study how sheaves can be learned from data. The resulting sheaf diffusion models have many desirable properties that address the limitations of classical graph diffusion equations (and corresponding GNN models) and obtain stateoftheart results in heterophilic settings. Overall, our work provides new connections between GNNs and algebraic topology and would be of interest to both fields. 
Cristian Bodnar · Francesco Di Giovanni · Benjamin Chamberlain · Pietro Lio · Michael Bronstein 🔗 