Poster
in
Workshop: 2nd Workshop on Mathematical and Empirical Understanding of Foundation Models
Lessons from Identifiability for Understanding Large Language Models
Patrik Reizinger · Szilvia Ujváry · Anna Mészáros · Anna Kerekes · Wieland Brendel · Ferenc Huszar
Many interesting properties emerge in LLMs, including rule extrapolation, in-context learning, and data-efficient fine-tunability. We demonstrate that good statistical generalization alone cannot explain these phenomena due to the inherent non-identifiability of autoregressive (AR) probabilistic models. Indeed, models that are zero or near-zero KL divergence apart---and thus have equivalent test loss---can exhibit markedly different behaviours. We illustrate the practical implications of three types of non-identifiability for AR LLMs: (1) the non-identifiability of zero-shot rule extrapolation; (2) the approximate non-identifiability of in-context learning; and (3) the non-identifiability of fine-tunability. We hypothesize that these important properties of LLMs are induced by inductive biases.
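A minimal numerical sketch of the central claim (this toy construction and its probability values are illustrative assumptions, not the paper's actual models): two distributions over strings can have identical, near-zero KL divergence from the training distribution, yet assign different probabilities to an out-of-distribution string, i.e., one extrapolates the underlying rule and the other does not.

```python
import math

# Training distribution: two short strings following an "a's then equally many b's" rule.
train_dist = {"ab": 0.5, "aabb": 0.5}

# Two hypothetical models over a small string universe. Both reserve the same
# tiny probability mass for strings outside the training support, but spend it
# differently: model_p extrapolates the rule to a longer string, model_q does not.
model_p = {"ab": 0.49, "aabb": 0.49, "aaabbb": 0.02, "ba": 0.00}
model_q = {"ab": 0.49, "aabb": 0.49, "aaabbb": 0.00, "ba": 0.02}

def kl(p, q):
    """KL(p || q), summed over the support of p."""
    return sum(px * math.log(px / q[x]) for x, px in p.items() if px > 0)

# Identical (near-zero) divergence from the training distribution ...
print(kl(train_dist, model_p))   # ~0.0202
print(kl(train_dist, model_q))   # ~0.0202

# ... yet markedly different behaviour on the out-of-distribution string.
print(model_p["aaabbb"], model_q["aaabbb"])   # 0.02 vs 0.0
```

Since both models achieve the same (near-optimal) test loss on the training distribution, nothing in that loss distinguishes the extrapolating model from the non-extrapolating one; only inductive biases can break the tie.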