

Poster in Workshop: 2nd Workshop on Mathematical and Empirical Understanding of Foundation Models

Is Mamba Capable of In-Context Learning?

Riccardo Grazzi · Julien Siems · Simon Schrodi · Thomas Brox · Frank Hutter


Abstract:

This work provides empirical evidence that Mamba, a recently proposed selective structured state space model, has in-context learning (ICL) capabilities similar to those of transformers. We evaluate Mamba on tasks involving simple function approximation as well as more complex natural language processing problems. Our results demonstrate that, across both categories of tasks, Mamba matches the ICL performance of transformer models. Further analysis reveals that, like transformers, Mamba appears to solve ICL problems by incrementally optimizing its internal representations. Overall, our work suggests that Mamba can be an efficient alternative to transformers for ICL tasks involving longer input sequences.
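To make the function-approximation setting concrete, below is a minimal sketch (not the authors' code) of a standard in-context regression evaluation in the style of common ICL benchmarks (e.g., Garg et al., 2022): a prompt interleaves (x_i, f(x_i)) pairs for a randomly drawn linear function f, and the model's prediction for each y_i is read off at the position of x_i. The `model` here is a trivial stand-in; in the paper's setting it would be a pretrained Mamba or transformer of matched size, and the prompt format and helper names are illustrative assumptions.

```python
import torch

def make_linear_regression_prompt(batch, n_points, dim, seed=0):
    """Sample random linear functions f(x) = w.x and build interleaved prompts."""
    g = torch.Generator().manual_seed(seed)
    w = torch.randn(batch, dim, 1, generator=g)          # per-task weight vectors
    x = torch.randn(batch, n_points, dim, generator=g)   # in-context inputs
    y = x @ w                                            # targets, shape (batch, n_points, 1)
    # Interleave x and y tokens as [x_1, y_1, x_2, y_2, ...], zero-padding y to width `dim`.
    y_tok = torch.cat([y, torch.zeros(batch, n_points, dim - 1)], dim=-1)
    prompt = torch.stack([x, y_tok], dim=2).reshape(batch, 2 * n_points, dim)
    return prompt, y

@torch.no_grad()
def icl_squared_error(model, prompt, y):
    """Mean squared error of the prediction emitted right after each x_i."""
    preds = model(prompt)        # causal model: (batch, 2*n_points, dim) -> same shape
    y_hat = preds[:, 0::2, 0]    # positions of x_1, x_2, ... hold the y_i predictions
    return ((y_hat - y.squeeze(-1)) ** 2).mean().item()

if __name__ == "__main__":
    dim, n_points = 8, 32
    prompt, y = make_linear_regression_prompt(batch=64, n_points=n_points, dim=dim)
    # Stand-in model for illustration only; a pretrained Mamba or transformer
    # (any causal sequence-to-sequence module) would be evaluated the same way.
    model = torch.nn.Sequential(torch.nn.Linear(dim, dim))
    print("ICL squared error:", icl_squared_error(model, prompt, y))
```

In this protocol, ICL ability is measured by how the error decreases as more (x_i, y_i) pairs appear in the prompt, without any weight updates to the model.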
