154 Results
Poster
|
Thu 17:00 |
Learning to Sample with Local and Global Contexts in Experience Replay Buffer Youngmin Oh · Kimin Lee · Jinwoo Shin · Eunho Yang · Sung Ju Hwang |
|
Poster
|
Tue 17:00 |
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients Yihan Wang · Beining Han · Tonghan Wang · Heng Dong · Chongjie Zhang |
|
Poster
|
Mon 1:00 |
Parameter-Based Value Functions Francesco Faccio · Louis Kirsch · Jürgen Schmidhuber |
|
Poster
|
Thu 1:00 |
Representation Balancing Offline Model-based Reinforcement Learning Byung-Jun Lee · Jongmin Lee · Kee-Eung Kim |
|
Poster
|
Thu 17:00 |
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds Yihao Feng · Ziyang Tang · Na Zhang · Qiang Liu |
|
Poster
|
Tue 1:00 |
Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples Ziang Yan · Yiwen Guo · Jian Liang · Changshui Zhang |
|
Poster
|
Wed 9:00 |
Benchmarks for Deep Off-Policy Evaluation Justin Fu · Mohammad Norouzi · Ofir Nachum · George Tucker · Ziyu Wang · Alexander Novikov · Sherry Yang · Michael Zhang · Yutian Chen · Aviral Kumar · Cosmin Paduraru · Sergey Levine · Thomas Paine |
|
Poster
|
Tue 17:00 |
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization Michael Zhang · Thomas Paine · Ofir Nachum · Cosmin Paduraru · George Tucker · Ziyu Wang · Mohammad Norouzi |
|
Poster
|
Thu 1:00 |
Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning Enrico Marchesini · Davide Corsi · Alessandro Farinelli |
|
Spotlight
|
Thu 3:45 |
Iterative Empirical Game Solving via Single Policy Best Response Max Smith · Thomas Anthony · Michael Wellman |
|
Poster
|
Tue 9:00 |
Iterative Empirical Game Solving via Single Policy Best Response Max Smith · Thomas Anthony · Michael Wellman |
|
Spotlight
|
Thu 19:25 |
Self-Supervised Policy Adaptation during Deployment Nicklas Hansen · Rishabh Jangir · Yu Sun · Guillem Alenyà · Pieter Abbeel · Alexei Efros · Lerrel Pinto · Xiaolong Wang |