2 Results
Poster | Wed 9:00 | Two-Timescale Networks for Nonlinear Value Function Approximation | Wesley Chung · Somjit Nath · Ajin Joseph · Martha White
Poster | Wed 14:30 | Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy | Yuan Xie · Boyi Liu · Qiang Liu · Zhaoran Wang · Yuan Zhou · Jian Peng