Poster
in
Workshop: Workshop on Spurious Correlation and Shortcut Learning: Foundations and Solutions

BEYOND DECODABILITY: LINEAR FEATURE SPACES ENABLE VISUAL COMPOSITIONAL GENERALIZATION

Arnas Uselis · Andrea Dittadi · Seong Joon Oh

Keywords: [ compositionality ] [ OOD generalization ]


Abstract:

While compositional generalization is fundamental to human intelligence, we still lack an understanding of how neural networks combine learned representations of parts into novel wholes. We investigate whether neural networks express representations as linear sums of simpler constituent parts. Our analysis reveals that models trained from scratch often exhibit decodability, meaning their features can be linearly decoded to perform the task well, yet lack linear structure, which prevents them from generalizing zero-shot. Linearity of representations instead emerges only under high training-data diversity. We prove that when representations are linear, perfect generalization to novel concept combinations is possible with minimal training data. Empirically evaluating large-scale pretrained models through this lens reveals that they achieve strong generalization for certain concept types while still falling short of the ideal linear structure for others.
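The distinction the abstract draws between decodability and linear structure can be illustrated with a small synthetic sketch (all names and constructions below are hypothetical toy choices, not the paper's actual setup). Two representation maps are compared: an "additive" one whose embedding of a concept combination is the sum of per-concept embeddings, and an "entangled" one that assigns each combination an arbitrary vector. Both are linearly decodable on the combinations seen in training, but only the additive one lets a linear probe generalize zero-shot to a held-out combination.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two binary concept factors, e.g. shape in {0, 1} and color in {0, 1}.
combos = [(0, 0), (0, 1), (1, 0), (1, 1)]
d = 8  # embedding dimension (arbitrary toy choice)

# "Linear" representation: embedding of a combination is the sum of
# per-concept embeddings.
shape_emb = rng.normal(size=(2, d))
color_emb = rng.normal(size=(2, d))

def rep_linear(s, c):
    return shape_emb[s] + color_emb[c]

# "Entangled" representation: each combination gets its own arbitrary
# embedding -- still linearly decodable on seen combinations, but with
# no additive structure shared across concepts.
combo_emb = {sc: rng.normal(size=d) for sc in combos}

def rep_entangled(s, c):
    return combo_emb[(s, c)]

def probe_error(rep):
    # Fit a least-squares linear probe for "shape" on three combinations,
    # then test zero-shot on the held-out combination (1, 1).
    train = [(0, 0), (0, 1), (1, 0)]
    X = np.stack([rep(s, c) for s, c in train])
    y = np.array([s for s, _ in train], dtype=float)
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    # Ideal prediction on (1, 1) is shape label 1.
    return abs(float(rep(1, 1) @ w) - 1.0)

err_linear = probe_error(rep_linear)
err_entangled = probe_error(rep_entangled)
```

For the additive map, the held-out embedding is exactly `rep(1,0) + rep(0,1) - rep(0,0)`, so any probe fitting the training points predicts the held-out label exactly; the entangled map offers no such guarantee, which mirrors the abstract's claim that decodability alone does not imply compositional generalization.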
