Poster
Risk-Sensitive Variational Actor-Critic: A Model-Based Approach
Alonso Granados · Mohammadreza Ebrahimi · Jason Pacheco
Hall 3 + Hall 2B #414
Risk-sensitive reinforcement learning (RL) with an entropic risk measure typically requires knowledge of the transition kernel or performs unstable updates with respect to exponential Bellman equations. As a consequence, algorithms that optimize this objective have been restricted to tabular or low-dimensional continuous environments. In this work we leverage the connection between the entropic risk measure and the RL-as-inference framework to develop a risk-sensitive variational actor-critic algorithm (rsVAC). Our work extends the variational framework to incorporate stochastic rewards and proposes a variational model-based actor-critic approach that modulates policy risk via a risk parameter. We consider both the risk-seeking and risk-averse regimes and present rsVAC learning variants for each setting. Our experiments demonstrate that this approach produces risk-sensitive policies and yields improvements in both tabular and risk-aware variants of complex continuous control tasks in MuJoCo.
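For reference, the entropic risk measure mentioned in the abstract is commonly defined, for a return $R$ and risk parameter $\beta$, as

$$\rho_\beta(R) = \frac{1}{\beta} \log \mathbb{E}\!\left[ e^{\beta R} \right],$$

where $\beta > 0$ yields risk-seeking and $\beta < 0$ risk-averse behavior under the standard convention; the paper's exact sign convention and objective are not specified in this abstract.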