Learning to Plan in High Dimensions via Neural Exploration-Exploitation Trees

Binghong Chen, Bo Dai, Qinjie Lin, Guo Ye, Han Liu, Le Song

Keywords: meta learning, planning, reinforcement learning, representation learning, sample efficiency

Wednesday: RL and Planning

Abstract: We propose a meta path planning algorithm named \emph{Neural Exploration-Exploitation Trees~(NEXT)} for learning from prior experience for solving new path planning problems in high dimensional continuous state and action spaces. Compared to more classical sampling-based methods like RRT, our approach achieves much better sample efficiency in high-dimensions and can benefit from prior experience of planning in similar environments. More specifically, NEXT exploits a novel neural architecture which can learn promising search directions from problem structures. The learned prior is then integrated into a UCB-type algorithm to achieve an online balance between \emph{exploration} and \emph{exploitation} when solving a new problem. We conduct thorough experiments to show that NEXT accomplishes new planning problems with more compact search trees and significantly outperforms state-of-the-art methods on several benchmarks.

Similar Papers

Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets
Mingrui Liu, Youssef Mroueh, Jerret Ross, Wei Zhang, Xiaodong Cui, Payel Das, Tianbao Yang,