Why Propagate Alone? Parallel Use of Labels and Features on Graphs

Yangkun Wang · Jiarui Jin · Weinan Zhang · Yongyi Yang · Jiuhai Chen · Quan Gan · Yong Yu · Zheng Zhang · Zengfeng Huang · David Wipf

Abstract
Tue 26 Apr 6:30 p.m. PDT — 8:30 p.m. PDT


One of the challenges of graph-based semi-supervised learning, relative to ordinary supervised learning for classification tasks, lies in label utilization. Directly feeding ground-truth labels to a model at training time can cause a parametric model to learn trivial degenerate solutions (e.g., an identity mapping from input labels to output predictions). To address this issue, a label trick has recently been proposed in the literature and applied to a wide range of graph neural network (GNN) architectures, achieving state-of-the-art results on various datasets. The essential idea is to randomly split the observed labels on the graph, use one fraction as model input (alongside the original node features), and train the model to predict the remaining fraction. Despite its success in enabling GNNs to propagate features and labels simultaneously, this approach has never been analyzed from a theoretical perspective, nor fully explored across certain natural use cases. In this paper, we demonstrate that under suitable settings, this stochastic trick can be reduced to a more interpretable deterministic form, allowing us to better explain its behavior, including an emergent regularization effect, and to motivate broader application scenarios. Our experimental results corroborate these analyses while also demonstrating improved node classification performance when applying the label trick in new domains.
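To make the label-trick setup concrete, here is a minimal sketch of the random label split described above, using numpy. All names (`label_trick_split`, the toy sizes, the 50% split fraction) are illustrative assumptions, not the paper's implementation; a real pipeline would pass the resulting input to a GNN and resample the split each training step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy graph data: 8 nodes, 4-dim features, 3 classes, 6 labeled (observed) nodes.
num_nodes, num_feats, num_classes = 8, 4, 3
X = rng.normal(size=(num_nodes, num_feats))        # node features
y = rng.integers(0, num_classes, size=num_nodes)   # ground-truth labels
train_idx = np.arange(6)                           # indices of observed labels

def label_trick_split(train_idx, frac=0.5, rng=rng):
    """Randomly split the observed labels into an input set and a supervision set."""
    perm = rng.permutation(train_idx)
    cut = int(len(perm) * frac)
    return perm[:cut], perm[cut:]   # (label-input nodes, supervised nodes)

input_idx, sup_idx = label_trick_split(train_idx)

# Model input: features concatenated with a partially revealed one-hot label matrix.
Y_in = np.zeros((num_nodes, num_classes))
Y_in[input_idx, y[input_idx]] = 1.0   # only the input-split labels are revealed
model_input = np.concatenate([X, Y_in], axis=1)

# A GNN consuming `model_input` is then trained with a loss on `sup_idx` only,
# so it cannot fit the identity mapping from revealed labels to its own targets.
```

Because the loss is computed only on nodes whose labels were withheld from the input, the degenerate identity solution described above yields no training signal, which is what makes joint feature-and-label propagation viable.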
