

Poster in Workshop: Frontiers in Probabilistic Inference: learning meets Sampling

Inherent Exploration via Sampling for Stochastic Policies

Zhenpeng Shi · Chi Xu · Huaze Tang · Wenbo Ding


Abstract: In this paper, we propose a novel exploration strategy for reinforcement learning in continuous action spaces that works by controlling the sampling strategy of stochastic policies. The proposed method, Inherent Exploration via Sampling (IES), enhances exploration by diversifying actions through the selection of varied Gaussian inputs. IES leverages the inherent stochasticity of policies to improve exploration without relying on external bonuses. Furthermore, it integrates seamlessly with existing exploration methods and introduces negligible computational overhead. Theoretically, we prove that IES achieves $\mathcal{O}\left(\epsilon^{-3}\right)$ sample complexity under the actor-critic framework in continuous action spaces. Experimentally, we evaluate IES on Gaussian policies (e.g., Soft Actor-Critic, Proximal Policy Optimization) and consistency-based policies on the MuJoCo, dm_control, and Isaac Gym continuous control benchmarks. The results demonstrate that IES effectively enhances the exploration capabilities of different policies, thereby improving the convergence of various reinforcement learning algorithms.
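To make the idea of "diversifying actions through the selection of varied Gaussian inputs" concrete, below is a minimal, hypothetical sketch of sampling for a reparameterized Gaussian policy. The abstract does not specify IES's actual selection rule; the farthest-point choice among candidate noise draws used here, and the function name `sample_diverse_action`, are illustrative assumptions, not the paper's method.

```python
# Hypothetical sketch: pick a "novel" standard-normal input eps and map it
# through the usual reparameterization a = mean + std * eps.
# The farthest-point selection rule below is an assumption for illustration,
# not the IES criterion from the paper.
import torch

def sample_diverse_action(mean, log_std, recent_eps, num_candidates=8):
    """Draw several candidate Gaussian inputs and keep the one farthest
    from recently used noise vectors, then reparameterize into an action."""
    std = log_std.exp()
    candidates = torch.randn(num_candidates, *mean.shape)   # candidate Gaussian inputs
    if len(recent_eps) == 0:
        eps = candidates[0]
    else:
        history = torch.stack(recent_eps)                   # (H, action_dim)
        # distance from each candidate to its nearest recently used noise vector
        dists = torch.cdist(candidates, history).min(dim=1).values
        eps = candidates[dists.argmax()]                     # most "novel" input
    recent_eps.append(eps)
    return mean + std * eps                                  # reparameterized action

# Toy usage with a 3-dimensional action space
mean, log_std = torch.zeros(3), torch.zeros(3)
history = []
for _ in range(5):
    print(sample_diverse_action(mean, log_std, history))
```

Because the diversification happens purely at the noise-sampling step, a sketch like this leaves the policy's mean and variance heads untouched, which is consistent with the abstract's claims that the method adds negligible overhead and composes with existing exploration techniques.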
