Poster

Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy

Yuan Xie · Boyi Liu · Qiang Liu · Zhaoran Wang · Yuan Zhou · Jian Peng
2019 Poster

Abstract
