The Expressive Limits of Diagonal SSMs for State-Tracking
Mehran Shakerinava · Behnoush Khavari · Siamak Ravanbakhsh · Sarath Chandar
Abstract
State-Space Models (SSMs) have recently been shown to achieve strong empirical performance on a variety of long-range sequence modeling tasks while remaining efficient and highly-parallelizable. However, the theoretical understanding of their expressive power remains limited. In this work, we study the expressivity of input-Dependent Complex-valued Diagonal (DCD) State-Space Models (SSMs) on sequential state-tracking tasks for abstract groups. It is easy to show that a single DCD SSM layer with a universal decoder can track any Abelian group at finite precision by decomposing it into a product of cyclic groups. We show that this is tight by proving that such a model cannot track any non-Abelian group at finite precision. We further establish the expressivity of multi-layer DCD SSMs. We show that a $k$-layer DCD SSM tracks a group if and only if that group has a subnormal series of length at most $k$, with Abelian factor groups. Empirically, while multi-layer models are theoretically expressive enough for solvable non-Abelian groups, we find they often fail to learn such solutions in practice, highlighting a gap between expressivity and learnability.
Successful Page Load