ReinforceWalk: Learning to Walk in Graph with Monte Carlo Tree Search

Workshop

ReinforceWalk: Learning to Walk in Graph with Monte Carlo Tree Search

Yelong Shen · Jianshu Chen · Po-Sen Huang · Yuqing Guo · Jianfeng Gao

[ Abstract ]

[ PDF]

We consider the problem of learning to walk over a graph towards a target node for a given input query and a source node (e.g., knowledge graph reasoning). We propose a new method called ReinforceWalk, which consists of a deep recurrent neural network (RNN) and a Monte Carlo Tree Search (MCTS). The RNN encodes the history of observations and map it into the Q-value, the policy and the state value. The MCTS is combined with the RNN policy to generate trajectories with more positive rewards, overcoming the sparse reward problem. Then, the RNN policy is updated in an off-policy manner from these trajectories. ReinforceWalk repeats these steps to learn the policy. At testing stage, the MCTS is also combined with the RNN to predict the target node with higher accuracy. Experiment results show that we are able to learn better policies from less number of rollouts compared to other methods, which are mainly based on policy gradient method.

Chat is not available.