

Poster

Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP

Liwei Wang ⋅ Yuanhao Wang ⋅ Kefan Dong ⋅ Xiaoyu Chen

Abstract
