Gluing Local Contexts into Global Meaning: A Sheaf-Theoretic Decomposition of Transformer Representations
Bryce Grant ⋅ Peng Wang
Abstract
We decompose transformer activations into content-stable ($H^0$) and context-dependent ($H^1$) subspaces using sheaf cohomology. A cellular sheaf built over paraphrase graphs yields a Laplacian whose spectral structure separates phrasing-invariant directions from maximally varying ones, requiring no concept labels or supervised training. Across five models (124M--13B parameters), $H^1$ dimensions exert $3.5$--$26.5\times$ greater causal influence on model output than variance-matched controls (Cohen's $d = 2.3$--$14.3$), $H^0$ retrieves facts at 60--68\% accuracy using only 20 dimensions, and the two subspaces produce opposite effects under ablation. The decomposition also reveals architecture-dependent fragility: Llama-2-7B collapses under random perturbation (4.2\% fact preservation) while all directed methods preserve facts at 12--14\% ($p < 10^{-10}$, $n = 1000$); with architecture-specific restriction maps this gap widens to 31.0\% vs.\ 4.2\% ($p < 10^{-50}$). Robust models tolerate both perturbation types. Project page: https://cwru-aism.github.io/gluing-lc-page/
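To make the spectral idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the simplest possible sheaf, with identity restriction maps on every edge of the paraphrase graph, in which case the sheaf Laplacian's quadratic form reduces to $X^\top L X = \sum_{(u,v)} (x_u - x_v)(x_u - x_v)^\top$, where $L$ is the ordinary graph Laplacian and $X$ stacks one activation vector per sentence. The low and high ends of that spectrum are used here only as stand-ins for the content-stable and context-dependent subspaces. The function names, the synthetic activations, and the choice of `k` are all hypothetical.

```python
import numpy as np

def paraphrase_laplacian(n_nodes, edges):
    """Graph Laplacian of a paraphrase graph (one node per sentence)."""
    L = np.zeros((n_nodes, n_nodes))
    for u, v in edges:
        L[u, u] += 1.0
        L[v, v] += 1.0
        L[u, v] -= 1.0
        L[v, u] -= 1.0
    return L

def split_directions(X, edges, k):
    """Return k lowest- and k highest-variation directions across paraphrase edges.

    Assumption: identity restriction maps, so the d x d form X^T L X equals
    the sum over edges of (x_u - x_v)(x_u - x_v)^T.
    """
    L = paraphrase_laplacian(X.shape[0], edges)
    M = X.T @ L @ X
    M = (M + M.T) / 2.0                    # symmetrize against round-off
    eigvals, eigvecs = np.linalg.eigh(M)   # eigenvalues in ascending order
    stable = eigvecs[:, :k]                # phrasing-invariant stand-in
    varying = eigvecs[:, -k:]              # maximally varying stand-in
    return stable, varying, eigvals

# Synthetic "activations": dims 0-2 carry content shared by all paraphrases
# of a fact, dims 3-5 carry phrasing-specific variation.
rng = np.random.default_rng(0)
n_groups, per_group = 4, 3
rows, edges = [], []
for g in range(n_groups):
    content = rng.normal(size=3)
    base = g * per_group
    for i in range(per_group):
        rows.append(np.concatenate([content, rng.normal(size=3)]))
        for j in range(i):
            edges.append((base + j, base + i))  # link paraphrases of one fact
X = np.stack(rows)

stable, varying, eigvals = split_directions(X, edges, k=3)
print("eigenvalues:", np.round(eigvals, 3))
print("stable-subspace mass on phrasing dims:", np.abs(stable[3:, :]).max())
```

In this toy setting the three near-zero eigenvalues pick out the content coordinates and the remaining directions capture phrasing variation. Learned or architecture-specific restriction maps, as mentioned in the abstract, would replace the identity maps assumed here.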