Imitation Learning by Reinforcement Learning

Kamil Ciosek

Keywords: [ reinforcement learning ] [ continuous control ] [ imitation learning ] [ markov decision process ]

Tue 26 Apr 2:30 a.m. PDT — 4:30 a.m. PDT


Imitation learning algorithms learn a policy from demonstrations of expert behavior. We show that, for deterministic experts, imitation learning can be done by reduction to reinforcement learning with a stationary reward. Our theoretical analysis both certifies the recovery of the expert reward and bounds the total variation distance between the expert and the imitation learner, showing a link to adversarial imitation learning. Our experiments confirm that the reduction works well in practice on continuous control tasks.
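
To make the idea of a stationary reward concrete, here is a minimal sketch. It assumes, for illustration only, that the reward is an indicator of whether a state-action pair appears in the expert demonstrations; the abstract does not state the exact form of the reward. The names `make_indicator_reward` and `expert_demos`, and the toy chain environment they describe, are hypothetical.

```python
from typing import Callable, Hashable, Iterable, Set, Tuple

StateAction = Tuple[Hashable, Hashable]


def make_indicator_reward(
    expert_pairs: Iterable[StateAction],
) -> Callable[[Hashable, Hashable], float]:
    """Build a stationary reward: 1.0 on expert (state, action) pairs, else 0.0.

    Assumption: this indicator form is an illustrative choice, not a detail
    given in the abstract.
    """
    support: Set[StateAction] = set(expert_pairs)

    def reward(state: Hashable, action: Hashable) -> float:
        # The reward depends only on (state, action), never on the training
        # step, which is what makes it stationary.
        return 1.0 if (state, action) in support else 0.0

    return reward


# Hypothetical demonstrations from a deterministic expert on a toy chain.
expert_demos = [(0, "right"), (1, "right"), (2, "stay")]
reward = make_indicator_reward(expert_demos)
```

Any standard reinforcement learning algorithm could then be trained against `reward` in place of the environment reward. For continuous control, exact membership would rarely match, so some relaxation of the indicator (for example one based on distance to the demonstrations) would presumably be needed; that part is not sketched here.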
