ICLR 2018 Papers

NerveNet: Learning Structured Policy with Graph Neural Networks

Sensitivity and Generalization in Neural Networks: an Empirical Study

On the Information Bottleneck Theory of Deep Learning

Latent Constraints: Learning to Generate Conditionally from Unconditional Generative Models

Semi-parametric topological memory for navigation

Memory-based Parameter Adaptation

Routing Networks: Adaptive Selection of Non-Linear Functions for Multi-Task Learning

Decision Boundary Analysis of Adversarial Examples

Hierarchical Subtask Discovery with Non-Negative Matrix Factorization

A Framework for the Quantitative Evaluation of Disentangled Representations

Demystifying MMD GANs

Imitation Learning from Visual Data with Multiple Intentions

Mixed Precision Training of Convolutional Neural Networks using Integer Operations

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

Beyond Shared Hierarchies: Deep Multitask Learning through Soft Layer Ordering

Attacking Binarized Neural Networks

On the insufficiency of existing momentum schemes for Stochastic Optimization

Hierarchical Representations for Efficient Architecture Search

Divide and Conquer Networks

Few-Shot Learning with Graph Neural Networks

WHAI: Weibull Hybrid Autoencoding Inference for Deep Topic Modeling

Quantitatively Evaluating GANs With Divergences Proposed for Training

The power of deeper networks for expressing natural functions

Simulated+Unsupervised Learning With Adaptive Data Generation and Bidirectional Mappings

Lifelong Learning with Dynamically Expandable Networks

Generative networks as inverse problems with Scattering transforms

Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches

Learn to Pay Attention

Truncated Horizon Policy Search: Combining Reinforcement Learning & Imitation Learning

Syntax-Directed Variational Autoencoder for Structured Data

Auto-Encoding Sequential Monte Carlo

Parametrized Hierarchical Procedures for Neural Programming

Reinforcement Learning on Web Interfaces using Workflow-Guided Exploration

Learning to Share: Simultaneous Parameter Tying and Sparsification in Deep Learning

A DIRT-T Approach to Unsupervised Domain Adaptation

Improving GAN Training via Binarized Representation Entropy (BRE) Regularization

Unsupervised Machine Translation Using Monolingual Corpora Only

Towards Image Understanding from Deep Compression Without Decoding

Communication Algorithms via Deep Learning

Spatially Transformed Adversarial Examples

On the regularization of Wasserstein GANs

Learning Awareness Models

Relational Neural Expectation Maximization: Unsupervised Discovery of Objects and their Interactions

Scalable Private Learning with PATE

Kronecker-factored Curvature Approximations for Recurrent Neural Networks

Kernel Implicit Variational Inference

MaskGAN: Better Text Generation via Filling in the _______

Maximum a Posteriori Policy Optimisation

Fraternal Dropout

Improving the Improved Training of Wasserstein GANs: A Consistency Term and Its Dual Effect

Universal Agent for Disentangling Environments and Tasks

Deep Complex Networks

Learning Parametric Closed-Loop Policies for Markov Potential Games

Synthesizing realistic neural population activity patterns using Generative Adversarial Networks

Model compression via distillation and quantization

Active Neural Localization

Towards Synthesizing Complex Programs From Input-Output Examples

Decision-Based Adversarial Attacks: Reliable Attacks Against Black-Box Machine Learning Models

SMASH: One-Shot Model Architecture Search through HyperNetworks

Word translation without parallel data

Consequentialist conditional cooperation in social dilemmas with imperfect information

Natural Language Inference over Interaction Space

Learning to cluster in order to transfer across domains and tasks

Compressing Word Embeddings via Deep Compositional Code Learning

Spectral Normalization for Generative Adversarial Networks

Training and Inference with Integers in Deep Neural Networks

Empirical Risk Landscape Analysis for Understanding Deep Neural Networks

Characterizing Adversarial Subspaces Using Local Intrinsic Dimensionality

Automatically Inferring Data Quality for Spatiotemporal Forecasting

Distributed Fine-tuning of Language Models on Private Data

A New Method of Region Embedding for Text Classification

Cascade Adversarial Machine Learning Regularized with a Unified Embedding

Learning how to explain neural networks: PatternNet and PatternAttribution

Memory Augmented Control Networks

Boosting Dilated Convolutional Networks with Mixed Tensor Decompositions

When is a Convolutional Filter Easy to Learn?

Regularizing and Optimizing LSTM Language Models

Rethinking the Smaller-Norm-Less-Informative Assumption in Channel Pruning of Convolution Layers

Learning Discrete Weights Using the Local Reparameterization Trick

Recasting Gradient-Based Meta-Learning as Hierarchical Bayes

Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks

Training wide residual networks for deployment using a single bit for each weight

Deep Learning and Quantum Entanglement: Fundamental Connections with Implications to Network Design

Unsupervised Learning of Goal Spaces for Intrinsically Motivated Goal Exploration

Noisy Networks For Exploration

Learning to Count Objects in Natural Images for Visual Question Answering

Learning Wasserstein Embeddings

Espresso: Efficient Forward Propagation for Binary Deep Neural Networks

Critical Percolation as a Framework to Analyze the Training of Deep Networks

Towards better understanding of gradient-based attribution methods for Deep Neural Networks

An Online Learning Approach to Generative Adversarial Networks

Progressive Reinforcement Learning with Distillation for Multi-Skilled Motion Control

Improving GANs Using Optimal Transport

Reinforcement Learning Algorithm Selection

Leveraging Grammar and Reinforcement Learning for Neural Program Synthesis

A Neural Representation of Sketch Drawings

Deep Rewiring: Training very sparse deep networks

SpectralNet: Spectral Clustering using Deep Neural Networks

A Bayesian Perspective on Generalization and Stochastic Gradient Descent

Mixed Precision Training

Adaptive Dropout with Rademacher Complexity Regularization

Robustness of Classifiers to Universal Perturbations: A Geometric Perspective

Detecting Statistical Interactions from Neural Network Weights

Deep Learning with Logged Bandit Feedback

Semantically Decomposing the Latent Spaces of Generative Adversarial Networks

Hyperparameter optimization: a spectral approach

Multi-Scale Dense Networks for Resource Efficient Image Classification

Unsupervised Cipher Cracking Using Discrete GANs

Minimax Curriculum Learning: Machine Teaching with Desirable Difficulties and Scheduled Diversity

Implicit Causal Models for Genome-wide Association Studies

Training Generative Adversarial Networks via Primal-Dual Subgradient Methods: A Lagrangian Perspective on GAN

Adaptive Quantization of Neural Networks

Multi-Task Learning for Document Ranking and Query Suggestion

Learning Deep Mean Field Games for Modeling Large Population Behavior

Residual Connections Encourage Iterative Inference

Auto-Conditioned Recurrent Networks for Extended Complex Human Motion Synthesis

Polar Transformer Networks

DORA The Explorer: Directed Outreaching Reinforcement Action-Selection

Emergent Complexity via Multi-Agent Competition

Learning from Between-class Examples for Deep Sound Recognition

DCN+: Mixed Objective And Deep Residual Coattention for Question Answering

Learning to Multi-Task by Active Sampling

Large scale distributed neural network training through online distillation

SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data

Multi-level Residual Networks from Dynamical Systems View

Gradient Estimators for Implicit Models

Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields

Learning Latent Representations in Neural Networks for Clustering through Pseudo Supervision and Graph-based Activity Regularization

Parallelizing Linear Recurrent Neural Nets Over Sequence Length

On the Discrimination-Generalization Tradeoff in GANs

Depthwise Separable Convolutions for Neural Machine Translation

FusionNet: Fusing via Fully-aware Attention with Application to Machine Comprehension

Emergence of grid-like representations by training recurrent neural networks to perform spatial localization

The High-Dimensional Geometry of Binary Neural Networks

Understanding Short-Horizon Bias in Stochastic Meta-Optimization

Deep Sensing: Active Sensing using Multi-directional Recurrent Neural Networks

Mitigating Adversarial Effects Through Randomization

The Implicit Bias of Gradient Descent on Separable Data

Alternating Multi-bit Quantization for Recurrent Neural Networks

Bi-Directional Block Self-Attention for Fast and Memory-Efficient Sequence Modeling

On the importance of single directions for generalization

AmbientGAN: Generative models from lossy measurements

An image representation based convolutional network for DNA classification

Parameter Space Noise for Exploration

Identifying Analogies Across Domains

Training Confidence-calibrated Classifiers for Detecting Out-of-Distribution Samples

A Scalable Laplace Approximation for Neural Networks

Minimal-Entropy Correlation Alignment for Unsupervised Deep Domain Adaptation

Learning Sparse Neural Networks through L_0 Regularization

Activation Maximization Generative Adversarial Nets

VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop

Emergent Translation in Multi-Agent Communication

cGANs with Projection Discriminator

Learning to Represent Programs with Graphs

Fix your classifier: the marginal value of training the last weight layer

On the State of the Art of Evaluation in Neural Language Models

Stabilizing Adversarial Nets with Prediction Methods

Emergent Communication in a Multi-Modal, Multi-Step Referential Game

Monotonic Chunkwise Attention

Emergent Communication through Negotiation

PixelNN: Example-based Image Synthesis

WRPN: Wide Reduced-Precision Networks

Self-ensembling for visual domain adaptation

Learning a neural response metric for retinal prosthesis

Loss-aware Weight Quantization of Deep Networks

Eigenoption Discovery through the Deep Successor Representation

Variational Continual Learning

TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning

Improving the Universality and Learnability of Neural Programmer-Interpreters with Combinator Abstraction

Neural Map: Structured Memory for Deep Reinforcement Learning

Learning to Teach

Backpropagation through the Void: Optimizing control variates for black-box gradient estimation

SEARNN: Training RNNs with global-local losses

TD or not TD: Analyzing the Role of Temporal Differencing in Deep Reinforcement Learning

Few-shot Autoregressive Density Estimation: Towards Learning to Learn Distributions

Deep Gaussian Embedding of Graphs: Unsupervised Inductive Learning via Ranking

Learning Differentially Private Recurrent Language Models

Guide Actor-Critic for Continuous Control

Unbiased Online Recurrent Optimization

Learning Latent Permutations with Gumbel-Sinkhorn Networks

Wasserstein Auto-Encoders

Many Paths to Equilibrium: GANs Do Not Need to Decrease a Divergence At Every Step

Large Scale Optimal Transport and Mapping Estimation

Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling

mixup: Beyond Empirical Risk Minimization

Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent

Can Neural Networks Understand Logical Entailment?

Synthetic and Natural Noise Both Break Neural Machine Translation

Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments

Smooth Loss Functions for Deep Top-k Classification

Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

Critical Points of Linear Neural Networks: Analytical Forms and Landscape Properties

Unsupervised Neural Machine Translation

Trust-PCL: An Off-Policy Trust Region Method for Continuous Control

PixelDefend: Leveraging Generative Models to Understand and Defend against Adversarial Examples

Stochastic Variational Video Prediction

Policy Optimization by Genetic Distillation

CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training

Ensemble Adversarial Training: Attacks and Defenses

Learning Sparse Latent Representations with the Deep Copula Information Bottleneck

Understanding Deep Neural Networks with Rectified Linear Units

GANITE: Estimation of Individualized Treatment Effects using Generative Adversarial Nets

Boundary Seeking GANs

SCAN: Learning Hierarchical Compositional Visual Concepts

Meta Learning Shared Hierarchies

Certifying Some Distributional Robustness with Principled Adversarial Training

Intrinsic Motivation and Automatic Curricula via Asymmetric Self-Play

Spherical CNNs

Generating Natural Adversarial Examples

Neural-Guided Deductive Search for Real-Time Program Synthesis from Examples

A Hierarchical Model for Device Placement

Emergence of Linguistic Communication from Referential Games with Symbolic and Pixel Input

Adversarial Dropout Regularization

Variational Message Passing with Structured Inference Networks

Temporally Efficient Deep Learning with Spikes

The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning

Variational Network Quantization

Latent Space Oddity: on the Curvature of Deep Generative Models

An efficient framework for learning sentence representations

Deep Autoencoding Gaussian Mixture Model for Unsupervised Anomaly Detection

Deep Active Learning for Named Entity Recognition

Neural Language Modeling by Jointly Learning Syntax and Lexicon

Predicting Floor-Level for 911 Calls with Neural Networks and Smartphone Sensor Data

Measuring the Intrinsic Dimension of Objective Landscapes

Thermometer Encoding: One Hot Way To Resist Adversarial Examples

Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning

Semantic Interpolation in Implicit Models

Certified Defenses against Adversarial Examples

Variance Reduction for Policy Gradient with Action-Dependent Factorized Baselines

Debiasing Evidence Approximations: On Importance-weighted Autoencoders and Jackknife Variational Inference

Defense-GAN: Protecting Classifiers Against Adversarial Attacks Using Generative Models

Model-Ensemble Trust-Region Policy Optimization

Initialization matters: Orthogonal Predictive State Recurrent Neural Networks

Ask the Right Questions: Active Question Reformulation with Reinforcement Learning

Multi-View Data Generation Without View Supervision

i-RevNet: Deep Invertible Networks

Divide-and-Conquer Reinforcement Learning

Learning General Purpose Distributed Sentence Representations via Large Scale Multi-task Learning

Evaluating the Robustness of Neural Networks: An Extreme Value Theory Approach

A Compressed Sensing View of Unsupervised Text Embeddings, Bag-of-n-Grams, and LSTMs

Multi-Mention Learning for Reading Comprehension with Neural Cascades

Fast and Accurate Reading Comprehension by Combining Self-Attention and Convolution

A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks

Deep Learning as a Mixed Convex-Combinatorial Optimization Problem

Deep Neural Networks as Gaussian Processes

Global Optimality Conditions for Deep Neural Networks

Meta-Learning for Semi-Supervised Few-Shot Classification

Matrix capsules with EM routing

Simulating Action Dynamics with Neural Process Networks

Understanding image motion with group representations

Generalizing Across Domains via Cross-Gradient Training

Diffusion Convolutional Recurrent Neural Network: Data-Driven Traffic Forecasting

Neural Speed Reading via Skim-RNN

On the Convergence of Adam and Beyond

HexaConv

Fidelity-Weighted Learning

Learning Approximate Inference Networks for Structured Prediction

Skip Connections Eliminate Singularities

Do GANs learn the distribution? Some Theory and Empirics

Breaking the Softmax Bottleneck: A High-Rank RNN Language Model

Dynamic Neural Program Embeddings for Program Repair

Deep Gradient Compression: Reducing the Communication Bandwidth for Distributed Training

Towards Deep Learning Models Resistant to Adversarial Attacks

Memorization Precedes Generation: Learning Unsupervised GANs with Memory Networks

Temporal Difference Models: Model-Free Deep RL for Model-Based Control

Towards Reverse-Engineering Black-Box Neural Networks

Online Learning Rate Adaptation with Hypergradient Descent

Variational Inference of Disentangled Latent Concepts from Unlabeled Observations

Unsupervised Representation Learning by Predicting Image Rotations

Generalizing Hamiltonian Monte Carlo with Neural Networks

FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling

Hierarchical Density Order Embeddings

Interactive Grounded Language Acquisition and Generalization in a 2D World

Generating Wikipedia by Summarizing Long Sequences

All-but-the-Top: Simple and Effective Postprocessing for Word Representations

Not-So-Random Features

Hierarchical and Interpretable Skill Acquisition in Multi-task Reinforcement Learning

Modular Continual Learning in a Unified Visual Environment

Stochastic gradient descent performs variational inference, converges to limit cycles for deep networks

The Role of Minimal Complexity Functions in Unsupervised Learning of Semantic Mappings

Boosting the Actor with Dual Critic

Neural Sketch Learning for Conditional Program Generation

MGAN: Training Generative Adversarial Nets with Multiple Generators

Towards Neural Phrase-based Machine Translation

Graph Attention Networks

Efficient Sparse-Winograd Convolutional Neural Networks

Compositional Obverter Communication Learning from Raw Visual Input

Enhancing The Reliability of Out-of-distribution Image Detection in Neural Networks

Residual Loss Prediction: Reinforcement Learning with No Incremental Feedback

Non-Autoregressive Neural Machine Translation

Overcoming Catastrophic Interference using Conceptor-Aided Backpropagation

On the Expressive Power of Overlapping Architectures of Deep Learning

Memory Architectures in Recurrent Neural Network Language Models

Sparse Persistent RNNs: Squeezing Large Recurrent Networks On-Chip

Interpretable Counting for Visual Question Answering

Countering Adversarial Images using Input Transformations

Learning One-hidden-layer Neural Networks with Landscape Design

Twin Networks: Matching the Future for Sequence Generation

Stochastic Activation Pruning for Robust Adversarial Defense

Viterbi-based Pruning for Sparse Matrix with Fixed and High Index Compression Ratio

Proximal Backpropagation

Learning Intrinsic Sparse Structures within Long Short-Term Memory

Variational image compression with a scale hyperprior

Distributed Prioritized Experience Replay

FearNet: Brain-Inspired Model for Incremental Learning

Don't Decay the Learning Rate, Increase the Batch Size

A Simple Neural Attentive Meta-Learner

Action-dependent Control Variates for Policy Optimization via Stein Identity

Generative Models of Visually Grounded Imagination

The Kanerva Machine: A Generative Distributed Memory

N2N learning: Network to Network Compression via Policy Gradient Reinforcement Learning

Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning

Wavelet Pooling for Convolutional Neural Networks

Progressive Growing of GANs for Improved Quality, Stability, and Variation

Active Learning for Convolutional Neural Networks: A Core-Set Approach

Meta-Learning and Universality: Deep Representations and Gradient Descent can Approximate any Learning Algorithm

Can recurrent neural networks warp time?

Gaussian Process Behaviour in Wide Deep Neural Networks

Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy

Learning From Noisy Singly-labeled Data

On Unifying Deep Generative Models

Compositional Attention Networks for Machine Reasoning

A Deep Reinforced Model for Abstractive Summarization

Expressive power of recurrent neural networks

Combining Symbolic Expressions and Black-box Function Evaluations in Neural Programs

Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering

Deep Learning for Physical Processes: Incorporating Prior Scientific Knowledge

Zero-Shot Visual Imitation

Sobolev GAN

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

Learning a Generative Model for Validity in Complex Discrete Structures

Distributed Distributional Deterministic Policy Gradients

Learning an Embedding Space for Transferable Robot Skills

Neumann Optimizer: A Practical Optimization Algorithm for Deep Neural Networks

Decoupling the Layers in Residual Networks

Training GANs with Optimism