Canonical Tree Cover Neural Networks for Expressive and Invariant Graph Learning
Abstract
While message-passing neural networks (MPNNs) are naturally invariant on graphs, they are fundamentally limited in expressive power and suffer from oversmoothing and oversquashing. Canonicalization offers a powerful alternative: it maps each graph to a unique, invariant representation on which expressive non-invariant encoders can operate. However, existing approaches rely on a single canonical sequence, which distorts graph distances and restricts expressivity. To address these limitations, we introduce Canonical Tree Cover Neural Networks (CTNNs), which represent a graph with a canonical cover of spanning trees, each processed by an expressive tree encoder. Theoretically, tree covers preserve graph distances better than sequences, and on sparse graphs the cover recovers all edges with a number of trees logarithmic in the graph size, making CTNNs strictly more expressive than sequence-based canonicalization approaches. Empirically, CTNNs consistently outperform invariant GNNs and sequence-based canonical GNNs on sparse molecular and protein graph classification benchmarks. Overall, CTNNs advance graph learning by providing an efficient, invariant, and expressive representation learning framework for sparse graphs via tree cover-based canonicalization.
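To make the tree-cover idea concrete, the following is a minimal, hypothetical sketch of one way to cover a connected graph's edges with spanning trees: repeatedly extract a BFS spanning tree that prefers still-uncovered edges, until every edge appears in some tree. The function name, the BFS construction, and the greedy tie-breaking are illustrative assumptions, not the paper's canonical construction (which must additionally fix a canonical ordering to guarantee invariance).

```python
from collections import deque

def spanning_tree_cover(edges):
    """Greedily extract BFS spanning trees that prefer uncovered edges,
    until every edge of a connected graph appears in some tree.
    Illustrative sketch only; not the paper's canonical algorithm."""
    key = lambda u, v: (min(u, v), max(u, v))
    uncovered = {key(u, v) for u, v in edges}
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    trees = []
    while uncovered:
        # Root each BFS at an endpoint of a still-uncovered edge so that
        # at least one new edge is covered per tree (guarantees progress).
        root = next(iter(uncovered))[0]
        seen, queue, tree = {root}, deque([root]), []
        while queue:
            u = queue.popleft()
            # Visit neighbors reached via uncovered edges first.
            for v in sorted(adj[u], key=lambda w: key(u, w) not in uncovered):
                if v not in seen:
                    seen.add(v)
                    tree.append((u, v))
                    uncovered.discard(key(u, v))
                    queue.append(v)
        trees.append(tree)
    return trees
```

On sparse graphs each tree already contains all nodes and most edges, so the number of trees needed to cover the remaining edges stays small, which is the intuition behind the logarithmic cover-size claim above.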