ICLR Graphical Clusterability and Local Specialization in Deep Neural Networks

Poster in Workshop: PAIR^2Struct: Privacy, Accountability, Interpretability, Robustness, Reasoning on Structured Data

Graphical Clusterability and Local Specialization in Deep Neural Networks

Stephen Casper · Shlomi Hod · Daniel Filan · Cody Wild · Andrew Critch · Stuart Russell

Abstract:

The learned weights of deep neural networks have often been considered devoid of scrutable internal structure, and tools for studying them have not traditionally relied on techniques from network science. In this paper, we present methods for studying structure among a network's neurons by clustering them and for quantifying how well this reveals both graphical clusterability and local specialization -- the degree to which the network can be understood as having distinct, highly internally connected subsets of neurons that perform subtasks. We offer a pipeline for this analysis consisting of methods for (1) representing a network as a graph, (2) clustering that graph, and performing statistical analysis to determine how (3) graphically clusterable and (4) functionally specialized the clusters are. We demonstrate that image classification networks up to ImageNet scale are often highly clusterable and locally specialized.
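
The first three pipeline steps can be illustrated with a minimal sketch on a toy multilayer perceptron. This is not the authors' implementation, and the abstract does not name the specific methods, so every concrete choice here is an assumption: absolute weights as edge weights, a two-way spectral split on the normalized Laplacian, normalized cut as the clusterability score, and a within-layer weight shuffle as the null model. The function names and the toy weights are hypothetical. Step (4), measuring functional specialization, needs a trained model and data and is omitted.

```python
import numpy as np

def mlp_to_adjacency(weights):
    """Step 1 (assumed form): neurons are nodes, |weight| is the edge weight.
    weights[i] has shape (n_i, n_{i+1})."""
    sizes = [weights[0].shape[0]] + [w.shape[1] for w in weights]
    offsets = np.concatenate([[0], np.cumsum(sizes)])
    n = int(offsets[-1])
    adj = np.zeros((n, n))
    for i, w in enumerate(weights):
        r0, r1, c1 = offsets[i], offsets[i + 1], offsets[i + 2]
        adj[r0:r1, r1:c1] = np.abs(w)
    return adj + adj.T  # undirected graph

def spectral_bipartition(adj):
    """Step 2 (assumed method): split nodes by the sign of the Fiedler vector
    of the symmetric normalized Laplacian."""
    deg = adj.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    lap = np.eye(len(adj)) - d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)  # eigenvalues in ascending order
    fiedler = eigvecs[:, 1] * d_inv_sqrt
    return (fiedler > 0).astype(int)

def normalized_cut(adj, labels):
    """Sum over clusters of (weight leaving the cluster) / (cluster volume).
    Lower values mean a more clusterable graph."""
    total = 0.0
    for c in np.unique(labels):
        inside = labels == c
        cut = adj[inside][:, ~inside].sum()
        vol = adj[inside].sum()
        total += cut / vol if vol > 0 else 0.0
    return total

def shuffle_test(weights, n_shuffles=20, seed=0):
    """Step 3 (assumed null model): compare the observed n-cut with n-cuts of
    networks whose weights are shuffled within each layer."""
    rng = np.random.default_rng(seed)
    adj = mlp_to_adjacency(weights)
    observed = normalized_cut(adj, spectral_bipartition(adj))
    null = []
    for _ in range(n_shuffles):
        shuffled = [rng.permutation(w.ravel()).reshape(w.shape) for w in weights]
        a = mlp_to_adjacency(shuffled)
        null.append(normalized_cut(a, spectral_bipartition(a)))
    p_value = (1 + sum(v <= observed for v in null)) / (1 + n_shuffles)
    return observed, p_value

# Hypothetical toy network: two modules with strong internal weights (1.0)
# and weak cross-module weights (0.01).
weak = 0.01
w1 = np.full((4, 4), weak)
w1[:2, :2] = 1.0
w1[2:, 2:] = 1.0
w2 = np.full((4, 2), weak)
w2[:2, 0] = 1.0
w2[2:, 1] = 1.0
weights = [w1, w2]

adj = mlp_to_adjacency(weights)
labels = spectral_bipartition(adj)
ncut = normalized_cut(adj, labels)
observed, p_value = shuffle_test(weights)
print(labels, ncut, p_value)
```

On this toy network the split should recover the two planted modules, and the shuffle comparison indicates whether the observed n-cut is unusually low relative to networks with the same weight distribution but no planted structure.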
