Spiking Neural Networks for Continuous Control: Neuromorphic Reinforcement Learning in Conventional Computing
Abstract
Reinforcement learning (RL) algorithms have made considerable strides over the past decade and have been applied to a wide range of problems and control tasks. Most of the noted improvements have come in discrete environments, using techniques such as convolutional neural networks, while progress in continuous environments has lagged behind. Neuromorphic hardware implementations have shared in these struggles, failing to make progress on RL frameworks that operate consistently in continuous environments. Implementations of Spiking Neural Networks (SNNs) aim to solve these problems by offering alternative strategies for implementing neural networks on neuromorphic hardware. In this paper, we propose the Spiking Actor Network Soft Actor Critic (SANSAC), an RL framework for continuous environments that is designed to be implementable on neuromorphic hardware. We compare a traditional Soft Actor Critic (SAC) network to SANSAC on a conventional computer. We demonstrate near-equivalent performance between SANSAC and SAC and examine the impact of hidden dimensions. Our results demonstrate the viability of SNN-based algorithms in complex continuous environments, as well as performance competitive with traditional neural networks on conventional computers, providing a basis for further exploration of SNNs in continuous RL frameworks.