Workshop
Deep Reinforcement Learning Meets Structured Prediction
Chen Liang · Ni Lao · Wang Ling · Zita Marinho · Yuandong Tian · Lu Wang · Jason D Williams · Audrey Durand · Andre Martins
Room R02
Mon 6 May, 7:45 a.m. PDT
Website: https://sites.google.com/view/iclr2019-drlstructpred
ICLR page: https://iclr.cc/Conferences/2019/Schedule?showEvent=630
Submission website: https://openreview.net/group?id=ICLR.cc/2019/Workshop/drlStructPred
Important Dates
Submission open March 6
Submission deadline March 15 (11:59pm AOE)
Decisions April 6
Camera Ready April 28 (11:59pm AOE)
Workshop May 6
Deep reinforcement learning has achieved notable successes on tasks such as computer games, the game of Go, and robotics. Structured prediction aims at modeling highly interdependent output variables and applies to a wide range of domains, including natural language processing, computer vision, and computational biology. In many cases, structured prediction can be viewed as a sequential decision-making process, which raises a natural question: can we leverage advances in deep RL to improve structured prediction?
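As a minimal illustration of this framing (a toy sketch, not drawn from any of the works cited below): generating an output structure token by token is an episode, each emitted token is an action, and a sequence-level score available only at the end is the reward, which a policy-gradient method such as REINFORCE (Sutton & Barto, 1998) can optimize directly. The vocabulary, target sequence, and reward below are all made up for illustration.

```python
# Toy sketch: structured prediction as sequential decision making.
# A tabular softmax policy emits one token per position; the reward
# (fraction of positions matching a gold sequence) arrives only after
# the full structure is produced, and REINFORCE optimizes it directly.
import numpy as np

rng = np.random.default_rng(0)
VOCAB, LENGTH = 5, 4                       # hypothetical vocabulary size / output length
target = np.array([2, 0, 3, 1])            # hypothetical gold structure
logits = np.zeros((LENGTH, VOCAB))         # one softmax per output position

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def sample_sequence():
    """Roll out the policy: one action (token) per decision step."""
    return np.array([rng.choice(VOCAB, p=softmax(logits[t])) for t in range(LENGTH)])

def reward(seq):
    """Sequence-level reward, only computable on the complete structure."""
    return float((seq == target).mean())

lr, baseline = 0.5, 0.0
for _ in range(500):
    seq = sample_sequence()
    r = reward(seq)
    baseline = 0.9 * baseline + 0.1 * r    # moving-average baseline reduces variance
    for t in range(LENGTH):
        grad = -softmax(logits[t])         # d log pi(a_t)/d logits = onehot(a_t) - p
        grad[seq[t]] += 1.0
        logits[t] += lr * (r - baseline) * grad   # REINFORCE ascent step

print("sampled:", sample_sequence(), "target:", target)  # typically matches after training
```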
Recently, promising results have been shown applying RL to various structured prediction problems, such as dialogue (Li et al., 2016; Williams et al., 2017; He et al., 2017), program synthesis (Bunel et al., 2018; Liang et al., 2018), semantic parsing (Liang et al., 2017), architecture search (Zoph & Le, 2017), chunking and parsing (Sharaf & Daumé III, 2018), machine translation (Ranzato et al., 2015; Norouzi et al., 2016; Bahdanau et al., 2016), summarization (Paulus et al., 2017), image captioning (Rennie et al., 2017), knowledge graph reasoning (Xiong et al., 2017), query rewriting (Nogueira et al., 2017; Buck et al., 2017), and information extraction (Narasimhan et al., 2016; Qin et al., 2018). However, there are also negative results where RL is less efficient than alternative approaches (Guu et al., 2017; Bender et al., 2018; Xu et al., 2018). As a community, it is important to understand the limits of RL in structured prediction and to identify promising directions for future work.
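To make the trade-off concrete, the standard contrast between the two training objectives (a textbook formulation, not specific to any single paper above) is that maximum likelihood trains on the gold structure directly, while RL maximizes expected task reward under the model's own samples, so its gradient estimates can be high-variance when high-reward samples are rare:

```latex
% MLE: supervised objective on the gold output y^* given input x.
% RL: expected sequence-level reward R(y) under the model's own samples.
\mathcal{L}_{\mathrm{MLE}}(\theta) = -\sum_{t=1}^{T} \log p_\theta\big(y^{*}_{t} \mid y^{*}_{<t}, x\big),
\qquad
\mathcal{L}_{\mathrm{RL}}(\theta) = -\,\mathbb{E}_{y \sim p_\theta(\cdot \mid x)}\big[R(y)\big].
```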
This workshop will bring together experts in structured prediction and reinforcement learning. Specifically, it will provide an overview of existing approaches from various domains in order to distill generally applicable principles from their successes. We will also discuss the main challenges arising in this setting and outline potential directions for future progress. The target audience consists of researchers and practitioners in these areas, including, but not limited to, deep RL for:
dialogue
semantic parsing
program synthesis
architecture search
machine translation
summarization
image captioning
knowledge graph reasoning
information extraction
Area: Reinforcement Learning, Applications
Accepted papers:
Connecting the Dots Between MLE and RL for Sequence Generation, Bowen Tan, Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Eric P. Xing
Buy 4 REINFORCE Samples, Get a Baseline for Free!, Wouter Kool, Herke van Hoof, Max Welling
Learning proposals for sequential importance samplers using reinforced variational inference, Zafarali Ahmed, Arjun Karuvally, Doina Precup, Simon Gravel
Learning Neurosymbolic Generative Models via Program Synthesis, Halley Young, Osbert Bastani, Mayur Naik
Multi-agent query reformulation: Challenges and the role of diversity, Rodrigo Nogueira, Jannis Bulian, Massimiliano Ciaramita
A Study of State Aliasing in Structured Prediction with RNNs, Layla El Asri, Adam Trischler
Neural Program Planner for Structured Predictions, Jacob Biloki, Chen Liang, Ni Lao
Robust Reinforcement Learning for Autonomous Driving, Yesmina Jaafra, Jean Luc Laurent, Aline Deruyver, Mohamed Saber Naceur
References:
Sutton, Richard S., and Andrew G. Barto. (1998). Reinforcement Learning: An Introduction. Cambridge: MIT Press.
Hal Daumé III, John Langford and Daniel Marcu. (2009). Search-based Structured Prediction. Machine Learning Journal.
Hal Daumé III. (2017). Structured prediction is not RL. Blog post.
He, Di, et al. (2016). Dual learning for machine translation. NIPS.
Ranzato, Marc'Aurelio, et al. (2015). Sequence level training with recurrent neural networks. arXiv preprint arXiv:1511.06732.
Y. Efroni, G. Dalal, B. Scherrer, S. Mannor. (2019). How to Combine Tree-Search Methods in Reinforcement Learning, AAAI.
Bahdanau, Dzmitry, Philemon Brakel, Kelvin Xu, Anirudh Goyal, Ryan Lowe, Joelle Pineau, Aaron Courville, and Yoshua Bengio. (2016). An actor-critic algorithm for sequence prediction. arXiv preprint arXiv:1607.07086; published at ICLR 2017.
Bunel, Rudy, et al. (2018). Leveraging grammar and reinforcement learning for neural program synthesis. arXiv preprint arXiv:1805.04276.
Buck, Christian, et al. (2017). Ask the right questions: Active question reformulation with reinforcement learning. arXiv preprint arXiv:1705.07830.
Nogueira, Rodrigo, and Kyunghyun Cho. (2017). Task-oriented query reformulation with reinforcement learning. arXiv preprint arXiv:1704.04572.
Paulus, Romain, Caiming Xiong, and Richard Socher. (2017). A deep reinforced model for abstractive summarization. arXiv preprint arXiv:1705.04304.
Norouzi, Mohammad, et al. (2016). Reward Augmented Maximum Likelihood for Neural Structured Prediction. NIPS.
Williams, Jason D., Kavosh Asadi, and Geoffrey Zweig. (2017). Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning. ACL.
Li, Jiwei, et al. (2016). Deep reinforcement learning for dialogue generation. arXiv preprint arXiv:1606.01541.
Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter. (2017). Batch Policy Gradient Methods for Improving Neural Conversation Models. ICLR.
Narasimhan, K., Yala, A., & Barzilay, R. (2016). Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (pp. 2355-2365).
Rennie, Steven J., et al. (2017). Self-critical sequence training for image captioning. CVPR.
Michael Gygli, Mohammad Norouzi, Anelia Angelova. (2017). Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs. ICML.
Barret Zoph, Quoc V. Le. (2017). Neural Architecture Search with Reinforcement Learning. ICLR.
Chen Liang, Jonathan Berant, Quoc Le, Kenneth D. Forbus, Ni Lao. (2017). Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision. ACL.
Kelvin Guu, Panupong Pasupat, Evan Zheran Liu, Percy Liang. (2017). From language to programs: Bridging reinforcement learning and maximum marginal likelihood. ACL.
Daniel A Abolafia, Mohammad Norouzi, Jonathan Shen, Rui Zhao, Quoc V. Le. (2018). Neural Program Synthesis with Priority Queue Training.
Chen Liang, Mohammad Norouzi, Jonathan Berant, Quoc Le, Ni Lao. (2018). Memory Augmented Policy Optimization for Program Synthesis with Generalization. NeurIPS.
Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Arnaud Doucet, Andriy Mnih, Yee Whye Teh. (2017). Filtering Variational Objectives. NIPS.
Dieterich Lawson, Chung-Cheng Chiu, George Tucker, Colin Raffel, Kevin Swersky, Navdeep Jaitly. (2018). Learning hard alignments with variational inference. ICASSP.
Gabriel Bender, Pieter-Jan Kindermans, Barret Zoph, Vijay Vasudevan, Quoc Le. (2018). Understanding and simplifying one-shot architecture search. ICML.
Hoang M. Le, Nan Jiang, Alekh Agarwal, Miroslav Dudík, Yisong Yue, Hal Daumé III. (2018). Hierarchical Imitation and Reinforcement Learning. ICML.
Amr Sharaf, Hal Daumé III. (2017). Structured prediction via learning to search under bandit feedback. SP4NLP workshop.
Xiaojun Xu, Chang Liu, Dawn Song. (2018). SQLNet: Generating Structured Queries from Natural Language Without Reinforcement Learning.
Wenhan Xiong, Thien Hoang, William Yang Wang. (2017). DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning. EMNLP.
Pengda Qin, Weiran Xu, William Yang Wang. (2018). Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning. ACL.