# Downloads 2019

Number of events: 520

- A2BCD: Asynchronous Acceleration with Optimal Complexity
- Accelerating Nonconvex Learning via Replica Exchange Langevin Diffusion
- Accumulation Bit-Width Scaling For Ultra-Low Precision Training Of Deep Networks
- A Closer Look at Deep Learning Heuristics: Learning rate restarts, Warmup and Distillation
- A Closer Look at Few-shot Classification
- A comprehensive, application-oriented study of catastrophic forgetting in DNNs
- A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
- Active Learning with Partial Feedback
- Adaptive Estimators Show Information Compression in Deep Neural Networks
- Adaptive Gradient Methods with Dynamic Bound of Learning Rate
- Adaptive Input Representations for Neural Language Modeling
- Adaptive Posterior Learning: few-shot learning with a surprise-based memory module
- Adaptivity of deep ReLU network for learning in Besov and mixed smooth Besov spaces: optimal rate and curse of dimensionality
- AdaShift: Decorrelation and Convergence of Adaptive Learning Rate Methods
- A Data-Driven and Distributed Approach to Sparse Signal Representation and Recovery
- ADef: an Iterative Algorithm to Construct Adversarial Deformations
- A Direct Approach to Robust Deep Learning Using Adversarial Networks
- AD-VAT: An Asymmetric Dueling mechanism for learning Visual Active Tracking
- Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network
- Adversarial Attacks on Graph Neural Networks via Meta Learning
- Adversarial Audio Synthesis
- Adversarial Domain Adaptation for Stable Brain-Machine Interfaces
- Adversarial Imitation via Variational Inverse Reinforcement Learning
- Adversarial Machine Learning
- Adversarial Reprogramming of Neural Networks
- A Generative Model For Electron Paths
- Aggregated Momentum: Stability Through Passive Damping
- AI for Social Good
- A Kernel Random Matrix-Based Approach for Sparse PCA
- Algorithmic Framework for Model-based Deep Reinforcement Learning with Theoretical Guarantees
- ALISTA: Analytic Weights Are As Good As Learned Weights in LISTA
- A Max-Affine Spline Perspective of Recurrent Neural Networks
- A Mean Field Theory of Batch Normalization
- Amortized Bayesian Meta-Learning
- Analysing Mathematical Reasoning Abilities of Neural Models
- Analysis of Quantized Models
- Analyzing Inverse Problems with Invertible Neural Networks
- An analytic theory of generalization dynamics and transfer learning in deep linear networks
- An Empirical study of Binary Neural Networks' Optimisation
- An Empirical Study of Example Forgetting during Deep Neural Network Learning
- A new dog learns old tricks: RL finds classic optimization algorithms
- AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks
- Anytime Minibatch: Exploiting Stragglers in Online Distributed Optimization
- Approximability of Discriminators Implies Diversity in GANs
- Approximating CNNs with Bag-of-local-Features models works surprisingly well on ImageNet
- Are adversarial examples inevitable?
- ARM: Augment-REINFORCE-Merge Gradient for Stochastic Binary Networks
- A rotation-equivariant convolutional neural network model of primary visual cortex
- A Statistical Approach to Assessing Neural Network Robustness
- Attention, Learn to Solve Routing Problems!
- Attentive Neural Processes
- Augmented Cyclic Adversarial Learning for Low Resource Domain Adaptation
- A Unified Theory of Early Visual Representations from Retina to Cortex through Anatomically Constrained Deep CNNs
- A Universal Music Translation Network
- AutoLoss: Learning Discrete Schedule for Alternate Optimization
- Automatically Composing Representation Transformations as a Means for Generalization
- Auxiliary Variational MCMC
- A Variational Inequality Perspective on Generative Adversarial Networks
- BabyAI: A Platform to Study the Sample Efficiency of Grounded Language Learning
- Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity
- BA-Net: Dense Bundle Adjustment Networks
- Bayesian Deep Convolutional Networks with Many Channels are Gaussian Processes
- Bayesian Policy Optimization for Model Uncertainty
- Bayesian Prediction of Future Street Scenes using Synthetic Likelihoods
- Benchmarking Neural Network Robustness to Common Corruptions and Perturbations
- Beyond Greedy Ranking: Slate Optimization via List-CVAE
- Beyond Pixel Norm-Balls: Parametric Adversaries using an Analytically Differentiable Renderer
- Bias-Reduced Uncertainty Estimation for Deep Neural Classifiers
- Big-Little Net: An Efficient Multi-Scale Feature Representation for Visual and Speech Recognition
- Biologically-Plausible Learning Algorithms Can Scale to Large Datasets
- Boosting Robustness Certification of Neural Networks
- Bounce and Learn: Modeling Scene Dynamics with Real-World Bounces
- Building Dynamic Knowledge Graphs from Text using Machine Reading Comprehension
- CAMOU: Learning Physical Vehicle Camouflages to Adversarially Attack Detectors in the Wild
- Can Machine Learning Help to Conduct a Planetary Healthcheck?
- Capsule Graph Neural Network
- Caveats for information bottleneck in deterministic scenarios
- CBOW Is Not All You Need: Combining CBOW with the Compositional Matrix Space Model
- CEM-RL: Combining evolutionary and gradient-based methods for policy search
- Characterizing Audio Adversarial Examples Using Temporal Dependency
- ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech
- Coarse-grain Fine-grain Coattention Network for Multi-evidence Question Answering
- code2seq: Generating Sequences from Structured Representations of Code
- Combinatorial Attacks on Binarized Neural Networks
- Competitive experience replay
- Complement Objective Training
- Composing Complex Skills by Learning Transition Policies
- Conditional Network Embeddings
- Context-adaptive Entropy Model for End-to-end Optimized Image Compression
- Contingency-Aware Exploration in Reinforcement Learning
- Convolutional Neural Networks on Non-uniform Geometrical Signals Using Euclidean Spectral Transformation
- Cost-Sensitive Robustness against Adversarial Examples
- Critical Learning Periods in Deep Networks
- DARTS: Differentiable Architecture Search
- Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds
- Debugging Machine Learning Models
- Decoupled Weight Decay Regularization
- Deep Anomaly Detection with Outlier Exposure
- Deep Convolutional Networks as shallow Gaussian Processes
- Deep Decoder: Concise Image Representations from Untrained Non-convolutional Networks
- Deep Frank-Wolfe For Neural Network Optimization
- Deep Generative Models for Highly Structured Data
- Deep Graph Infomax
- Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning
- Deep Layers as Stochastic Solvers
- Deep Learning 3D Shapes Using Alt-az Anisotropic 2-Sphere Convolution
- Deep learning generalizes because the parameter-function map is biased towards simple functions
- DeepOBS: A Deep Learning Optimizer Benchmark Suite
- Deep Online Learning Via Meta-Learning: Continual Adaptation for Model-Based RL
- Deep Reinforcement Learning Meets Structured Prediction
- Deep reinforcement learning with relational inductive biases
- Deep, Skinny Neural Networks are not Universal Approximators
- Defensive Quantization: When Efficiency Meets Robustness
- DELTA: Deep Learning Transfer Using Feature Map with Attention for Convolutional Networks
- Detecting Egregious Responses in Neural Sequence-to-sequence Models
- Deterministic PAC-Bayesian generalization bounds for deep networks via generalizing noise-resilience
- Deterministic Variational Inference for Robust Bayesian Neural Networks
- Developmental Autonomous Learning: AI, Cognitive Sciences and Educational Technology
- DHER: Hindsight Experience Replay for Dynamic Goals
- Diagnosing and Enhancing VAE Models
- DialogWAE: Multimodal Response Generation with Conditional Wasserstein Auto-Encoder
- Differentiable Learning-to-Normalize via Switchable Normalization
- Differentiable Perturb-and-Parse: Semi-Supervised Parsing with a Structured Variational Autoencoder
- Diffusion Scattering Transforms on Graphs
- Dimensionality Reduction for Representing the Knowledge of Probabilistic Models
- Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information
- Discovery of Natural Language Concepts in Individual Units of CNNs
- Discriminator-Actor-Critic: Addressing Sample Inefficiency and Reward Bias in Adversarial Imitation Learning
- Discriminator Rejection Sampling
- Disjoint Mapping Network for Cross-modal Matching of Voices and Faces
- Distributional Concavity Regularization for GANs
- Distribution-Interpolation Trade off in Generative Models
- Diversity and Depth in Per-Example Routing Models
- Diversity is All You Need: Learning Skills without a Reward Function
- Diversity-Sensitive Conditional Generative Adversarial Networks
- Do Deep Generative Models Know What They Don't Know?
- DOM-Q-NET: Grounded RL on Structured Language
- Don't let your Discriminator be fooled
- Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors
- Double Viterbi: Weight Encoding for High Compression Ratio and Fast On-Chip Reconstruction for Deep Neural Network
- Doubly Reparameterized Gradient Estimators for Monte Carlo Objectives
- DPSNet: End-to-end Deep Plane Sweep Stereo
- Dynamically Unfolding Recurrent Restorer: A Moving Endpoint Control Method for Image Restoration
- Dynamic Channel Pruning: Feature Boosting and Suppression
- Dynamic Sparse Graph for Efficient Deep Learning
- DyRep: Learning Representations over Dynamic Graphs
- Efficient Augmentation via Data Subsampling
- Efficient Lifelong Learning with A-GEM
- Efficiently testing local optimality and escaping saddles for ReLU networks
- Efficient Multi-Objective Neural Architecture Search via Lamarckian Evolution
- Efficient Training on Very Large Corpora via Gramian Estimation
- Eidetic 3D LSTM: A Model for Video Prediction and Beyond
- Emergent Coordination Through Competition
- Emerging Disentanglement in Auto-Encoder Based Unsupervised Image Content Transfer
- Enabling Factorized Piano Music Modeling and Generation with the MAESTRO Dataset
- Energy-Constrained Compression for Deep Neural Networks via Weighted Sparse Projection and Layer Input Masking
- Environment Probing Interaction Policies
- Episodic Curiosity through Reachability
- Equi-normalization of Neural Networks
- Evaluating Robustness of Neural Networks with Mixed Integer Programming
- Excessive Invariance Causes Adversarial Vulnerability
- Execution-Guided Neural Program Synthesis
- Exemplar Guided Unsupervised Image-to-Image Translation with Semantic Consistency
- Explaining Image Classifiers by Counterfactual Generation
- Exploration by random network distillation
- Feature Intertwiner for Object Detection
- Feature-Wise Bias Amplification
- Feed-forward Propagation in Probabilistic Neural Networks with Categorical and Max Layers
- FFJORD: Free-Form Continuous Dynamics for Scalable Reversible Generative Models
- Fixup Initialization: Residual Learning Without Normalization
- FlowQA: Grasping Flow in History for Conversational Machine Comprehension
- Fluctuation-dissipation relations for stochastic gradient descent
- From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference
- From Language to Goals: Inverse Reinforcement Learning for Vision-Based Instruction Following
- Functional Variational Bayesian Neural Networks
- Function Space Particle Optimization for Bayesian Neural Networks
- GamePad: A Learning Environment for Theorem Proving
- GAN Dissection: Visualizing and Understanding Generative Adversarial Networks
- GANSynth: Adversarial Neural Audio Synthesis
- Generalizable Adversarial Training via Spectral Normalization
- Generalized Tensor Models for Recurrent Neural Networks
- Generating High Fidelity Images with Subscale Pixel Networks and Multidimensional Upscaling
- Generating Liquid Simulations with Deformation-aware Neural Networks
- Generating Multi-Agent Trajectories using Programmatic Weak Supervision
- Generating Multiple Objects at Spatially Distinct Locations
- Generative Code Modeling with Graphs
- Generative predecessor models for sample-efficient imitation learning
- Generative Question Answering: Learning to Answer the Whole Question
- Global-to-local Memory Pointer Networks for Task-Oriented Dialogue
- GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
- GO Gradient for Expectation-Based Objectives
- Gradient descent aligns the layers of deep linear networks
- Gradient Descent Provably Optimizes Over-parameterized Neural Networks
- Graph HyperNetworks for Neural Architecture Search
- Graph Wavelet Neural Network
- G-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space
- Guiding Policies with Language via Meta-Learning
- Harmonic Unpaired Image-to-image Translation
- Harmonizing Maximum Likelihood with GANs for Multimodal Conditional Generation
- h-detach: Modifying the LSTM Gradient Towards Better Optimization
- Hierarchical Generative Modeling for Controllable Speech Synthesis
- Hierarchical interpretations for neural network predictions
- Hierarchical Reinforcement Learning via Advantage-Weighted Information Maximization
- Hierarchical RL Using an Ensemble of Proprioceptive Periodic Policies
- Hierarchical Visuomotor Control of Humanoids
- Highlights of Recent Developments in Algorithmic Fairness
- Hindsight policy gradients
- How Important is a Neuron
- How Powerful are Graph Neural Networks?
- How to train your MAML
- Human-level Protein Localization with Convolutional Neural Networks
- Hyperbolic Attention Networks
- Identifying and Controlling Important Neurons in Neural Machine Translation
- ImageNet-trained CNNs are biased towards texture; increasing shape bias improves accuracy and robustness
- Imposing Category Trees Onto Word-Embeddings Using A Geometric Construction
- Improving Differentiable Neural Computers Through Memory Masking, De-allocation, and Link Distribution Sharpness Control
- Improving Generalization and Stability of Generative Adversarial Networks
- Improving MMD-GAN Training with Repulsive Loss Function
- Improving Sequence-to-Sequence Learning via Optimal Transport
- Improving the Generalization of Adversarial Training with Domain Adaptation
- InfoBot: Transfer and Exploration via the Information Bottleneck
- Information asymmetry in KL-regularized RL
- Information-Directed Exploration for Deep Reinforcement Learning
- Information Theoretic lower bounds on negative log likelihood
- Initialized Equilibrium Propagation for Backprop-Free Training
- InstaGAN: Instance-aware Image-to-Image Translation
- Integer Networks for Data Compression with Latent-Variable Models
- Interpolation-Prediction Networks for Irregularly Sampled Time Series
- Invariant and Equivariant Graph Networks
- INVASE: Instance-wise Variable Selection using Neural Networks
- Janossy Pooling: Learning Deep Permutation-Invariant Functions for Variable-Size Inputs
- Kernel Change-point Detection with Auxiliary Deep Generative Models
- Kernel RNN Learning (KeRNL)
- K for the Price of 1: Parameter-efficient Multi-task and Transfer Learning
- KnockoffGAN: Generating Knockoffs for Feature Selection using Generative Adversarial Networks
- Knowledge Flow: Improve Upon Your Teachers
- L2-Nonexpansive Neural Networks
- Label super-resolution networks
- Lagging Inference Networks and Posterior Collapse in Variational Autoencoders
- LanczosNet: Multi-Scale Deep Graph Convolutional Networks
- Large-Scale Answerer in Questioner's Mind for Visual Dialog Question Generation
- Large Scale GAN Training for High Fidelity Natural Image Synthesis
- Large Scale Graph Learning From Smooth Signals
- Large-Scale Study of Curiosity-Driven Learning
- Latent Convolutional Models
- LatinX in AI and Black in AI Joint Workshop
- LayoutGAN: Generating Graphic Layouts with Wireframe Discriminators
- Learnable Embedding Space for Efficient Neural Architecture Compression
- Learning Actionable Representations with Goal Conditioned Policies
- Learning a Meta-Solver for Syntax-Guided Program Synthesis
- Learning a SAT Solver from Single-Bit Supervision
- Learning-Based Frequency Estimation Algorithms
- Learning concise representations for regression by evolving networks of trees
- Learning deep representations by mutual information estimation and maximization
- Learning Embeddings into Entropic Wasserstein Spaces
- Learning Exploration Policies for Navigation
- Learning Factorized Multimodal Representations
- Learning Factorized Representations for Open-Set Domain Adaptation
- Learning Finite State Representations of Recurrent Policy Networks
- Learning (from) language in context
- Learning from Positive and Unlabeled Data with a Selection Bias
- Learning Grid Cells as Vector Representation of Self-Position Coupled with Matrix Representation of Self-Motion
- Learning Implicitly Recurrent CNNs Through Parameter Sharing
- Learning Latent Superstructures in Variational Autoencoders for Deep Multidimensional Clustering
- Learning Localized Generative Models for 3D Point Clouds via Graph Convolution
- Learning Mixed-Curvature Representations in Product Spaces
- Learning Multi-Level Hierarchies with Hindsight
- Learning Multimodal Graph-to-Graph Translation for Molecule Optimization
- Learning Natural Language Interfaces with Neural Models
- Learning Neural PDE Solvers with Convergence Guarantees
- Learning Particle Dynamics for Manipulating Rigid Bodies, Deformable Objects, and Fluids
- Learning Procedural Abstractions and Evaluating Discrete Latent Temporal Structure
- Learning Programmatically Structured Representations with Perceptor Gradients
- Learning protein sequence embeddings using information from structure
- Learning Protein Structure with a Differentiable Simulator
- Learning Recurrent Binary/Ternary Weights
- Learning Representations of Sets through Optimized Permutations
- Learning Representations Using Causal Invariance
- Learning Robust Representations by Projecting Superficial Statistics Out
- Learning Self-Imitating Diverse Policies
- Learning sparse relational transition models
- Learning to Adapt in Dynamic, Real-World Environments through Meta-Reinforcement Learning
- Learning to Describe Scenes with Programs
- Learning to Design RNA
- Learning to Infer and Execute 3D Shape Programs
- Learning to Learn with Conditional Class Dependencies
- Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference
- Learning to Make Analogies by Contrasting Abstract Relational Structure
- Learning to Navigate the Web
- Learning to Propagate Labels: Transductive Propagation Network for Few-Shot Learning
- Learning to Remember More with Less Memorization
- Learning to Represent Edits
- Learning to Schedule Communication in Multi-agent Reinforcement Learning
- Learning to Screen for Fast Softmax Inference on Large Vocabulary Neural Networks
- Learning To Simulate
- Learning To Solve Circuit-SAT: An Unsupervised Differentiable Approach
- Learning to Understand Goal Specifications by Modelling Reward
- Learning Two-layer Neural Networks with Symmetric Inputs
- Learning what and where to attend
- Learning what you can do before doing anything
- Learning when to Communicate at Scale in Multiagent Cooperative and Competitive Tasks
- LeMoNADe: Learned Motif and Neuronal Assembly Detection in calcium imaging videos
- Local SGD Converges Fast and Communicates Little
- L-Shapley and C-Shapley: Efficient Model Interpretation for Structured Data
- M^3RL: Mind-aware Multi-agent Management Reinforcement Learning
- MAE: Mutual Posterior-Divergence Regularization for Variational AutoEncoders
- Marginalized Average Attentional Network for Weakly-Supervised Learning
- Marginal Policy Gradients: A Unified Family of Estimators for Bounded Action Spaces with Applications
- Maximal Divergence Sequential Autoencoder for Binary Software Vulnerability Detection
- Max-MIG: an Information Theoretic Approach for Joint Learning from Crowds
- Measuring and regularizing networks in function space
- Measuring Compositionality in Representation Learning
- Meta-Learning For Stochastic Gradient MCMC
- Meta-Learning Probabilistic Inference for Prediction
- Meta-Learning Update Rules for Unsupervised Representation Learning
- Meta-learning with differentiable closed-form solvers
- Meta-Learning with Latent Embedding Optimization
- Minimal Images in Deep Neural Networks: Fragile Object Recognition in Natural Images
- Minimal Random Code Learning: Getting Bits Back from Compressed Model Parameters
- Minimum Divergence vs. Maximum Margin: an Empirical Comparison on Seq2Seq Models
- MisGAN: Learning from Incomplete Data with Generative Adversarial Networks
- Modeling the Long Term Future in Model-Based Reinforcement Learning
- Modeling Uncertainty with Hedged Instance Embeddings
- Model-Predictive Policy Learning with Uncertainty Regularization for Driving in Dense Traffic
- Mode Normalization
- Multi-Agent Dual Learning
- Multi-class classification without multi-class labels
- Multi-Domain Adversarial Learning
- Multilingual Neural Machine Translation with Knowledge Distillation
- Multilingual Neural Machine Translation With Soft Decoupled Encoding
- Multiple-Attribute Text Rewriting
- Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering
- Music Transformer: Generating Music with Long-Term Structure
- NADPEx: An on-policy temporally consistent exploration method for deep reinforcement learning
- Near-Optimal Representation Learning for Hierarchical Reinforcement Learning
- Neural Graph Evolution: Automatic Robot Design
- Neural Logic Machines
- Neural network gradient-based learning of black-box function interfaces
- Neural Persistence: A Complexity Measure for Deep Neural Networks Using Algebraic Topology
- Neural Probabilistic Motor Primitives for Humanoid Control
- Neural Program Repair by Jointly Learning to Localize and Repair
- Neural Speed Reading with Structural-Jump-LSTM
- Neural TTS Stylization with Adversarial and Collaborative Games
- Non-vacuous Generalization Bounds at the ImageNet Scale: a PAC-Bayesian Compression Approach
- NOODL: Provable Online Dictionary Learning and Sparse Coding
- No Training Required: Exploring Random Encoders for Sentence Classification
- Off-Policy Evaluation and Learning from Logged Bandit Feedback: Error Reduction via Surrogate Policy
- On Computation and Generalization of Generative Adversarial Networks under Spectrum Control
- On Random Deep Weight-Tied Autoencoders: Exact Asymptotic Analysis, Phase Transitions, and Implications to Training
- On Self Modulation for Generative Adversarial Networks
- On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization
- On the loss landscape of a class of deep neural networks with no bad local valleys
- On the Minimal Supervision for Training Any Binary Classifier from Only Unlabeled Data
- On the Relation Between the Sharpest Directions of DNN Loss and the SGD Step Length
- On the Sensitivity of Adversarial Robustness to Input Data Distributions
- On the Turing Completeness of Modern Neural Network Architectures
- On the Universal Approximability and Complexity Bounds of Quantized ReLU Neural Networks
- Opportunistic Learning: Budgeted Cost-Sensitive Learning from Data Streams
- Optimal Completion Distillation for Sequence Learning
- Optimal Control Via Neural Networks: A Convex Approach
- Optimal Transport Maps For Distribution Preserving Operations on Latent Spaces of Generative Models
- Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile
- Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks
- Overcoming Catastrophic Forgetting for Continual Learning via Model Adaptation
- Overcoming the Disentanglement vs Reconstruction Trade-off via Jacobian Supervision
- PATE-GAN: Generating Synthetic Data with Differential Privacy Guarantees
- Pay Less Attention with Lightweight and Dynamic Convolutions
- PeerNets: Exploiting Peer Wisdom Against Adversarial Attacks
- Per-Tensor Fixed-Point Quantization of the Back-Propagation Algorithm
- Plan Online, Learn Offline: Efficient Learning and Exploration via Model-Based Control
- Poincare Glove: Hyperbolic Word Embeddings
- Policy Transfer with Strategy Optimization
- Posterior Attention Models for Sequence to Sequence Learning
- Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator
- Practical lossless compression with latent variables using bits back coding
- Preconditioner on Matrix Lie Group for SGD
- Predicting the Generalization Gap in Deep Networks with Margin Distributions
- Predict then Propagate: Graph Neural Networks meet Personalized PageRank
- Preferences Implicit in the State of the World
- Preventing Posterior Collapse with delta-VAEs
- Prior Convictions: Black-box Adversarial Attacks with Bandits and Priors
- Probabilistic Planning with Sequential Monte Carlo methods
- Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning
- ProbGAN: Towards Probabilistic GAN with Theoretical Guarantees
- ProMP: Proximal Meta-Policy Search
- ProxQuant: Quantized Neural Networks via Proximal Operators
- ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware
- Quasi-hyperbolic momentum and Adam for deep learning
- Quaternion Recurrent Neural Networks
- Query-Efficient Hard-label Black-box Attack: An Optimization-based Approach
- Random mesh projectors for inverse problems
- Reasoning About Physical Interactions with Object-Oriented Prediction and Planning
- Recall Traces: Backtracking Models for Efficient Reinforcement Learning
- Recurrent Experience Replay in Distributed Reinforcement Learning
- Regularized Learning for Domain Adaptation under Label Shifts
- Relational Forward Models for Multi-Agent Learning
- Relaxed Quantization for Discretized Neural Networks
- RelGAN: Relational Generative Adversarial Networks for Text Generation
- Representation Degeneration Problem in Training Natural Language Generation Models
- Representation Learning on Graphs and Manifolds
- Representing Formal Languages: A Comparison Between Finite Automata and Recurrent Neural Networks
- Reproducibility in Machine Learning
- Residual Non-local Attention Networks for Image Restoration
- Rethinking the Value of Network Pruning
- Revealing interpretable object representations from human behavior
- Reward Constrained Policy Optimization
- Riemannian Adaptive Optimization Methods
- Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures
- RNNs implicitly implement tensor-product representations
- Robust Conditional Generative Adversarial Networks
- Robust Estimation via Generative Adversarial Networks
- Robustness May Be at Odds with Accuracy
- RotatE: Knowledge Graph Embedding by Relational Rotation in Complex Space
- RotDCF: Decomposition of Convolutional Filters for Rotation-Equivariant Deep Networks
- Safe Machine Learning: Specification, Robustness, and Assurance
- Sample Efficient Adaptive Text-to-Speech
- Sample Efficient Imitation Learning for Continuous Control
- Scalable Unbalanced Optimal Transport using Generative Adversarial Networks
- Selfless Sequential Learning
- Self-Monitoring Navigation Agent via Auxiliary Progress Estimation
- Self-Tuning Networks: Bilevel Optimization of Hyperparameters using Structured Best-Response Functions
- SGD Converges to Global Minimum in Deep Learning via Star-convex Path
- signSGD via Zeroth-Order Oracle
- signSGD with Majority Vote is Communication Efficient and Fault Tolerant
- Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware
- Sliced Wasserstein Auto-Encoders
- Slimmable Neural Networks
- Small nonlinearities in activation functions create bad local minima in neural networks
- Smoothing the Geometry of Probabilistic Box Embeddings
- SNAS: stochastic neural architecture search
- SNIP: Single-Shot Network Pruning Based on Connection Sensitivity
- Soft Q-Learning with Mutual-Information Regularization
- Solving the Rubik's Cube with Approximate Policy Iteration
- SOM-VAE: Interpretable Discrete Representation Learning on Time Series
- Sparse Dictionary Learning by Dynamical Neural Networks
- Spectral Inference Networks: Unifying Deep and Spectral Learning
- Spherical CNNs on Unstructured Grids
- SPIGAN: Privileged Adversarial Learning from Simulation
- Spreading vectors for similarity search
- Stable Opponent Shaping in Differentiable Games
- Stable Recurrent Models
- STCN: Stochastic Temporal Convolutional Networks
- Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization
- Stochastic Optimization of Sorting Networks via Continuous Relaxations
- Stochastic Prediction of Multi-Agent Interactions from Partial Observations
- StrokeNet: A Neural Painting Environment
- Structured Adversarial Attack: Towards General Implementation and Better Interpretability
- Structured Neural Summarization
- Structure & Priors in Reinforcement Learning (SPiRL)
- Subgradient Descent Learns Orthogonal Dictionaries
- Supervised Community Detection with Line Graph Neural Networks
- Supervised Policy Update for Deep Reinforcement Learning
- Synthetic Datasets for Neural Program Synthesis
- Systematic Generalization: What Is Required and Can It Be Learned?
- Task-Agnostic Reinforcement Learning (TARL)
- Temporal Difference Variational Auto-Encoder
- textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior
- The 2nd Learning from Limited Labeled Data (LLD) Workshop: Representation Learning for Weak Supervision and Beyond
- The Comparative Power of ReLU Networks and Polynomial Kernels in the Presence of Sparse Latent Structure
- The Deep Weight Prior
- The Laplacian in RL: Learning Representations with Efficient Approximations
- The Limitations of Adversarial Training and the Blind-Spot Attack
- The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
- The Neuro-Symbolic Concept Learner: Interpreting Scenes, Words, and Sentences From Natural Supervision
- Theoretical Analysis of Auto Rate-Tuning by Batch Normalization
- There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average
- The relativistic discriminator: a key element missing from standard GAN
- The role of over-parametrization in generalization of neural networks
- The Singular Values of Convolutional Layers
- The Unusual Effectiveness of Averaging in GAN Training
- Three Mechanisms of Weight Decay Regularization
- TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
- Time-Agnostic Prediction: Predicting Predictable Video Frames
- Top-Down Neural Model For Formulae
- Towards GAN Benchmarks Which Require Generalization
- Towards Metamerism via Foveated Style Transfer
- Towards Robust, Locally Linear Deep Networks
- Towards the first adversarially robust neural network model on MNIST
- Towards Understanding Regularization in Batch Normalization
- Toward Understanding the Impact of Staleness in Distributed Machine Learning
- Training for Faster Adversarial Robustness Verification via Inducing ReLU Stability
- Transfer Learning for Sequences via Learning to Collocate
- Transferring Knowledge across Learning Processes
- Tree-Structured Recurrent Switching Linear Dynamical Systems for Multi-Scale Modeling
- Trellis Networks for Sequence Modeling
- Two-Timescale Networks for Nonlinear Value Function Approximation
- Understanding and Improving Interpolation in Autoencoders via an Adversarial Regularizer
- Understanding Composition of Word Embeddings via Tensor Decomposition
- Understanding Straight-Through Estimator in Training Activation Quantized Neural Nets
- Universal Stagewise Learning for Non-Convex Problems with Convergence on Averaged Solutions
- Universal Successor Features Approximators
- Universal Transformers
- Unsupervised Adversarial Image Reconstruction
- Unsupervised Control Through Non-Parametric Discriminative Rewards
- Unsupervised Discovery of Parts, Structure, and Dynamics
- Unsupervised Domain Adaptation for Distance Metric Learning
- Unsupervised Hyper-alignment for Multilingual Word Embeddings
- Unsupervised Learning of the Set of Local Maxima
- Unsupervised Learning via Meta-Learning
- Unsupervised Speech Recognition via Segmental Empirical Output Distribution Matching
- Value Propagation Networks
- Variance Networks: When Expectation Does Not Meet Your Expectations
- Variance Reduction for Reinforcement Learning in Input-Driven Environments
- Variational Autoencoders with Jointly Optimized Latent Dependency Structure
- Variational Autoencoder with Arbitrary Conditioning
- Variational Bayesian Phylogenetic Inference
- Variational Discriminator Bottleneck: Improving Imitation Learning, Inverse RL, and GANs by Constraining Information Flow
- Variational Smoothing in Recurrent Neural Network Language Models
- Verification of Non-Linear Specifications for Neural Networks
- Visceral Machines: Risk-Aversion in Reinforcement Learning with Intrinsic Physiological Rewards
- Visual Explanation by Interpretation: Improving Visual Feedback Capabilities of Deep Neural Networks
- Visual Reasoning by Progressive Module Networks
- Visual Semantic Navigation using Scene Priors
- Von Mises-Fisher Loss for Training Sequence to Sequence Models with Continuous Outputs
- Wasserstein Barycenter Model Ensembling
- What do you learn from context? Probing for sentence structure in contextualized word representations
- While We're All Worried about Failures of Machine Learning, What Dangers Lurk If It (Mostly) Works?
- Whitening and Coloring Batch Transform for GANs
- Wizard of Wikipedia: Knowledge-Powered Conversational Agents
- Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search