Poster in Workshop: Bridging the Gap Between Practice and Theory in Deep Learning
On the Representation Gap Between Modern RNNs and Transformers: The Curse of Memory Efficiency and the Fix of In-Context Retrieval
Kaiyue Wen · Xingyu Dang · Kaifeng Lyu
Abstract:
This paper investigates the limitations of Recurrent Neural Networks (RNNs) on algorithmic tasks, particularly in comparison with Transformers. Focusing on IsTree, a reasoning task that asks whether a given graph is a tree, we show that RNNs with $o(n)$-bit parameters cannot solve the task on graphs of size $n$, even with Chain-of-Thought (CoT), whereas Transformers can solve it with CoT using only $O(\log n)$-bit parameters. Our experiments confirm this representation gap. To overcome this limitation, we propose augmenting RNNs with in-context retrieval capabilities, specifically via regular expressions. This enhancement enables RNNs to solve IsTree and other algorithmic problems in $\mathsf{P}$ while maintaining their memory efficiency, closing the gap with Transformers.
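To make the IsTree task concrete (this is a standard characterization, not the authors' construction): an undirected graph on $n$ vertices is a tree if and only if it has exactly $n-1$ edges and is connected. A minimal Python sketch of the decision procedure, with a hypothetical helper name `is_tree`:

```python
from collections import deque

def is_tree(n, edges):
    """Decide IsTree: an undirected graph on n vertices is a tree
    iff it has exactly n - 1 edges and is connected."""
    if len(edges) != n - 1:
        return False
    # Build adjacency lists.
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # BFS from vertex 0 to test connectivity.
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n

# Example: the path 0-1-2-3 is a tree; adding edge (0, 3) creates a cycle.
print(is_tree(4, [(0, 1), (1, 2), (2, 3)]))          # True
print(is_tree(4, [(0, 1), (1, 2), (2, 3), (0, 3)]))  # False
```

This procedure runs in linear time, which underscores that the difficulty studied in the paper is about what constant-size (in $n$) recurrent memory can represent, not about the computational hardness of the task itself.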