Poster

Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP

Liwei Wang · Yuanhao Wang · Kefan Dong · Xiaoyu Chen

Abstract
