Poster in Workshop: Latent & Implicit Thinking – Going Beyond CoT Reasoning
Mon, Apr 27, 2026 • 7:40 AM – 8:30 AM PDT

Cross-Layer Clustering for Stochastic Parameter Decomposition

saman seshadri ⋅ Jack Digilov ⋅ Sean Esla ⋅ Nathan Hu ⋅ Michael Ivanitskiy ⋅ Pablo Bernabeu-Perez


Abstract

Mechanistic interpretability seeks to decompose neural networks into interpretable circuits. Stochastic parameter decomposition (SPD; Bushnaq et al., 2025) yields sparse, atomic subcomponents within layers, but it does not capture the multi-layer pathways that drive complex behavior. We propose a cross-layer spectral clustering framework that automatically discovers these distributed mechanisms by analyzing co-activation patterns across inputs. We measure the Pearson correlation between the importance scores of each pair of subcomponents and use it to construct a similarity graph linking disjoint parts of the network that contribute to the same computational task. On synthetic models with known circuits, our method recovers the ground-truth mechanistic structure, confirming its ability to identify cross-layer dependencies. When applied to small language models, we find multi-layer clusters whose top-activating examples suggest consistent linguistic functions (e.g., tracking salient entities and tense morphology). These clusters serve as high-quality hypotheses for follow-up causal tests, providing a scalable step toward discovering system-level mechanisms in language models.
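The pipeline described in the abstract (Pearson correlation of importance scores, a similarity graph, then spectral clustering) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The importance scores below are synthetic stand-ins rather than real SPD outputs, and several choices are assumptions made only for illustration: negative correlations are clipped to zero, the graph Laplacian is symmetrically normalized, the number of clusters is given in advance, and the helper names (`correlation_affinity`, `spectral_embedding`, `kmeans`) and layer labels are hypothetical.

```python
import numpy as np


def correlation_affinity(importance):
    """Build a similarity graph from importance scores.

    importance: array of shape (n_inputs, n_subcomponents).
    """
    corr = np.corrcoef(importance, rowvar=False)  # Pearson correlation per pair
    corr = np.nan_to_num(corr)                    # constant columns yield NaN
    affinity = np.clip(corr, 0.0, None)           # assumption: drop negative links
    np.fill_diagonal(affinity, 0.0)               # no self-loops
    return affinity


def spectral_embedding(affinity, k):
    """Embed nodes using the top-k eigenvectors of the normalized affinity."""
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    normalized = affinity * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # Largest eigenvalues of D^-1/2 A D^-1/2 correspond to the smallest
    # eigenvalues of the symmetric normalized Laplacian.
    _, eigvecs = np.linalg.eigh(normalized)
    emb = eigvecs[:, -k:].copy()
    emb /= np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    return emb


def kmeans(points, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialization."""
    centers = [points[0]]
    for _ in range(1, k):
        dists = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dists)])
    centers = np.array(centers)

    def assign(c):
        return np.argmin(((points[:, None, :] - c[None, :, :]) ** 2).sum(-1), axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return assign(centers)


# Synthetic stand-in: two hidden "circuits", each driving three subcomponents
# that sit in different (hypothetical) layers.
rng = np.random.default_rng(0)
layers = ["layer0", "layer0", "layer1", "layer1", "layer2", "layer2"]
true_circuit = np.array([0, 1, 0, 1, 0, 1])
latent = rng.random((500, 2))                         # per-input circuit activity
importance = latent[:, true_circuit] + 0.1 * rng.normal(size=(500, 6))

affinity = correlation_affinity(importance)
labels = kmeans(spectral_embedding(affinity, k=2), k=2)

for layer, label in zip(layers, labels):
    print(layer, "-> cluster", label)
```

In this toy setting each recovered cluster spans all three layers, which is the kind of cross-layer grouping the abstract describes. On real models, the choice of affinity and the number of clusters would need to be validated rather than assumed.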
