Bio-Inspired Spatial Reasoning Transformer: Grid Cells, Place Cells, and Attractor Dynamics for Text-Based Spatial Understanding
Abstract
Transformers struggle with spatial reasoning despite strong language understanding. We hypothesize that this stems from a lack of spatial inductive bias. Inspired by the mammalian hippocampal system, we introduce the Spatial Reasoning Transformer (SRT), which integrates three bio-inspired modules: GridPE (grid cell encoding), PlaceNet (place cell memory), and AttractorAttn (attractor dynamics attention). Each module is theoretically grounded, with proven guarantees. On text-based spatial reasoning benchmarks, SRT achieves an 8.1% improvement on SpaRTUN. Ablations reveal that AttractorAttn contributes 63.6% of the gains on complex relations, while GridPE benefits coordinate-tracking tasks. Our work demonstrates that bio-inspired inductive biases can enhance the spatial reasoning of Transformers.