Downloads 2021

Number of events: 896

$i$-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning
2nd Workshop on Practical ML for Developing Countries: Learning Under Limited/low Resource Scenarios
A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning
A Block Minifloat Representation for Training Deep Neural Networks
Accelerating Convergence of Replica Exchange Stochastic Gradient MCMC via Variance Reduction
Accurate Learning of Graph Representations with Graph Multiset Pooling
Achieving Linear Speedup with Partial Worker Participation in Non-IID Federated Learning
A Critique of Self-Expressive Deep Subspace Clustering
Acting in Delayed Environments with Non-Stationary Markov Policies
Activation-level uncertainty in deep neural networks
Active Contrastive Learning of Audio-Visual Video Representations
AdaFuse: Adaptive Temporal Fusion Network for Efficient Action Recognition
AdaGCN: Adaboosting Graph Convolutional Networks into Deep Models
AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights
Adapting to Reward Progressivity via Spectral Reinforcement Learning
Adaptive and Generative Zero-Shot Learning
Adaptive Extra-Gradient Methods for Min-Max Optimization and Games
Adaptive Federated Optimization
Adaptive Procedural Task Generation for Hard-Exploration Problems
Adaptive Universal Generalized PageRank Graph Neural Network
AdaSpeech: Adaptive Text to Speech for Custom Voice
A Design Space Study for LISTA and Beyond
A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima
A Discriminative Gaussian Mixture Model with Sparsity
A Distributional Approach to Controlled Text Generation
Adversarially Guided Actor-Critic
Adversarially-Trained Deep Nets Transfer Better: Illustration on Image Classification
Adversarial score matching and improved sampling for image generation
A Geometric Analysis of Deep Generative Image Models and Its Applications
A Good Image Generator Is What You Need for High-Resolution Video Synthesis
A Gradient Flow Framework For Analyzing Network Pruning
A Hypergradient Approach to Robust Regression without Correspondence
AI for Public Health
AI in Finance: Scope and Examples
AIMOCC -- AI: Modeling Oceans and Climate Change
A Learning Theoretic Perspective on Local Explainability
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
Aligning AI With Shared Human Values
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
Analyzing the Expressive Power of Graph Neural Networks in a Spectral Perspective
Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics
Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
ANOCE: Analysis of Causal Effects with Multiple Mediators via Constrained Structural Learning
Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval
An Unsupervised Deep Learning Approach for Real-World Image Denoising
Anytime Sampling for Autoregressive Models via Ordered Autoencoding
A PAC-Bayesian Approach to Generalization Bounds for Graph Neural Networks
A Panda? No, It's a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks
Are Neural Rankers still Outperformed by Gradient Boosted Decision Trees?
Are wider nets better given the same number of parameters?
ARMOURED: Adversarially Robust MOdels using Unlabeled data by REgularizing Diversity
A Roadmap to Never-Ending RL
Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning
A statistical theory of cold posteriors in deep neural networks
Async-RED: A Provably Convergent Asynchronous Block Parallel Stochastic Method using Deep Denoising Priors
A teacher-student framework to distill future trajectories
A Temporal Kernel Approach for Deep Learning with Continuous-time Information
A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention
Attentional Constellation Nets for Few-Shot Learning
Auction Learning as a Two-Player Game
Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting
A Unified Approach to Interpreting and Boosting Adversarial Transferability
A unifying view on implicit bias in training linear neural networks
A Universal Representation Transformer Layer for Few-Shot Image Classification
AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly
Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization
Autoregressive Entity Retrieval
Auto Seg-Loss: Searching Metric Surrogates for Semantic Segmentation
Auxiliary Learning by Implicit Differentiation
Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral
Average-case Acceleration for Bilinear Games and Normal Matrices
A Wigner-Eckart Theorem for Group Equivariant Convolution Kernels
Bag of Tricks for Adversarial Training
Balancing Constraints and Rewards with Meta-Gradient D4PG
Batch Reinforcement Learning Through Continuation Method
Bayesian Context Aggregation for Neural Processes
Bayesian Few-Shot Classification with One-vs-Each Pólya-Gamma Augmented Gaussian Processes
Behavioral Cloning from Noisy Demonstrations
Benchmarks for Deep Off-Policy Evaluation
Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Better Fine-Tuning by Reducing Representational Collapse
Beyond Categorical Label Representations for Image Classification
Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters
Beyond Static Papers: Rethinking How We Share Scientific Understanding in ML
Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech
BiPointNet: Binary Neural Network for Point Clouds
Blending MPC & Value Function Approximation for Efficient Reinforcement Learning
BOIL: Towards Representation Change for Few-shot Learning
Boost then Convolve: Gradient Boosting Meets Graph Neural Networks
Bowtie Networks: Generative Modeling for Joint Few-Shot Recognition and Novel-View Synthesis
BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction
BREEDS: Benchmarks for Subpopulation Shift
BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization
BUSTLE: Bottom-Up Program Synthesis Through Learning-Guided Exploration
Bypassing the Ambient Dimension: Private SGD with Gradient Subspace Identification
Byzantine-Resilient Non-Convex Stochastic Gradient Descent
Calibration of Neural Networks using Splines
Calibration tests beyond classification
Can a Fruit Fly Learn Word Embeddings?
CaPC Learning: Confidential and Private Collaborative Learning
Capturing Label Characteristics in VAEs
Categorical Normalizing Flows via Continuous Transformations
CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning
CcGAN: Continuous Conditional Generative Adversarial Networks for Image Generation
Certify or Predict: Boosting Certified Robustness with Compositional Architectures
Chaos of Learning Beyond Zero-sum and Coordination via Game Decompositions
Characterizing signal propagation to close the performance gap in unnormalized ResNets
ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations
Clairvoyance: A Pipeline Toolkit for Medical Time Series
Class Normalization for (Continual)? Generalized Zero-Shot Learning
C-Learning: Horizon-Aware Cumulative Accessibility Estimation
C-Learning: Learning to Achieve Goals via Recursive Classification
Clustering-friendly Representation Learning via Instance Discrimination and Feature Decorrelation
CO2: Consistent Contrast for Unsupervised Visual Representation Learning
CoCo: Controllable Counterfactuals for Evaluating Dialogue State Trackers
CoCon: A Self-Supervised Approach for Controlled Text Generation
CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding
Collective Robustness Certificates: Exploiting Interdependence in Graph Neural Networks
Colorization Transformer
Combining Ensembles and Data Augmentation Can Harm Your Calibration
Combining Label Propagation and Simple Models out-performs Graph Neural Networks
Combining Physics and Machine Learning for Network Flow Estimation
Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity
Commonsense AI: Myth and Truth
Communication in Multi-Agent Reinforcement Learning: Intention Sharing
Complex Query Answering with Neural Link Predictors
CompOFA – Compound Once-For-All Networks for Faster Multi-Platform Deployment
Computational Separation Between Convolutional and Fully-Connected Networks
Concept Learners for Few-Shot Learning
Conditional Generative Modeling via Learning the Latent Space
Conditionally Adaptive Multi-Task Learning: Improving Transfer Learning in NLP Using Fewer Parameters & Less Data
Conditional Negative Sampling for Contrastive Learning of Visual Representations
Conformation-Guided Molecular Representation with Hamiltonian Neural Networks
Conservative Safety Critics for Exploration
Contemplating Real-World Object Classification
Contextual Dropout: An Efficient Sample-Dependent Dropout Module
Contextual Transformation Networks for Online Continual Learning
Continual learning in recurrent neural networks
Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
Contrastive Divergence Learning is a Time Reversal Adversarial Game
Contrastive Explanations for Reinforcement Learning via Embedded Self Predictions
Contrastive Learning with Adversarial Perturbations for Conditional Text Generation
Contrastive Learning with Hard Negative Samples
Contrastive Syn-to-Real Generalization
Control-Aware Representations for Model-based Reinforcement Learning
Convex Potential Flows: Universal Probability Distributions with Optimal Transport and Convex Optimization
Convex Regularization behind Neural Reconstruction
Coping with Label Shift via Distributionally Robust Optimisation
CopulaGNN: Towards Integrating Representational and Correlational Roles of Graphs in Graph Neural Networks
Correcting experience replay for multi-agent communication
Counterfactual Generative Networks
Coupled Oscillatory Recurrent Neural Network (coRNN): An accurate and (gradient) stable architecture for learning long time dependencies
CPR: Classifier-Projection Regularization for Continual Learning
CPT: Efficient Deep Neural Network Training via Cyclic Precision
Creative Sketch Generation
Cross-Attentional Audio-Visual Fusion for Weakly-Supervised Action Localization
CT-Net: Channel Tensorization Network for Video Classification
Cut out the annotator, keep the cutout: better segmentation with weak supervision
Dance Revolution: Long-Term Dance Generation with Music via Curriculum Learning
DARTS-: Robustly Stepping out of Performance Collapse Without Indicators
Data-Efficient Reinforcement Learning with Self-Predictive Representations
Dataset Condensation with Gradient Matching
Dataset Inference: Ownership Resolution in Machine Learning
Dataset Meta-Learning from Kernel Ridge-Regression
DC3: A learning method for optimization with hard constraints
DDPNOpt: Differential Dynamic Programming Neural Optimizer
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Debiasing Concept-based Explanations with Causal Analysis
Decentralized Attribution of Generative Models
Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach
Deconstructing the Regularization of BatchNorm
Decoupling Global and Local Representations via Invertible Generative Flows
DeepAveragers: Offline Reinforcement Learning By Solving Derived Non-Parametric MDPs
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation
Deep Equals Shallow for ReLU Networks in Kernel Regimes
Deep Learning for Simulation
Deep Learning meets Projective Clustering
Deep Networks and the Multiple Manifold Problem
Deep Neural Network Fingerprinting by Conferrable Adversarial Examples
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Deep Partition Aggregation: Provable Defenses against General Poisoning Attacks
Deep Repulsive Clustering of Ordered Data Based on Order-Identity Decomposition
Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients
Deformable DETR: Deformable Transformers for End-to-End Object Detection
Degree-Quant: Quantization-Aware Training for Graph Neural Networks
DeLighT: Deep and Light-weight Transformer
Denoising Diffusion Implicit Models
Deployment-Efficient Reinforcement Learning via Model-Based Offline Optimization
DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues
DICE: Diversity in Deep Ensembles via Conditional Redundancy Adversarial Estimation
Differentiable Segmentation of Sequences
Differentiable Trust Region Layers for Deep Reinforcement Learning
Differentially Private Learning Needs Better Features (or Much More Data)
DiffWave: A Versatile Diffusion Model for Audio Synthesis
DINO: A Conditional Energy-Based GAN for Domain Translation
Directed Acyclic Graph Neural Networks
Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate
Disambiguating Symbolic Expressions in Informal Documents
Discovering a set of policies for the worst case reward
Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization
Discovering Non-monotonic Autoregressive Orderings with Variational Inference
Discrete Graph Structure Learning for Forecasting Multiple Time Series
Disentangled Recurrent Wasserstein Autoencoder
Disentangling 3D Prototypical Networks for Few-Shot Concept Learning
Distance-Based Regularisation of Deep Networks for Fine-Tuning
Distilling Knowledge from Reader to Retriever for Question Answering
Distributed Momentum for Byzantine-resilient Stochastic Gradient Descent
Distributional Sliced-Wasserstein and Applications to Generative Modeling
Diverse Video Generation using a Gaussian Process Trigger
Do 2D GANs Know 3D Shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs
Does enhanced shape bias improve neural network robustness to common corruptions?
Domain Generalization with MixStyle
Domain-Robust Visual Imitation Learning with Mutual Information Constraints
Do not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning
DOP: Off-Policy Multi-Agent Decomposed Policy Gradients
Do Wide and Deep Networks Learn the Same Things? Uncovering How Neural Network Representations Vary with Width and Depth
DrNAS: Dirichlet Neural Architecture Search
Drop-Bottleneck: Learning Discrete Compressed Representation for Noise-Robust Exploration
Dual-mode ASR: Unify and Improve Streaming ASR with Full-context Modeling
Dynamic Tensor Rematerialization
DynaTune: Dynamic Tensor Program Optimization in Deep Neural Network Compilation
Early Stopping in Deep Networks: Double Descent and How to Eliminate it
Economic Hyperparameter Optimization With Blended Search Strategy
EEC: Learning to Encode and Regenerate Images for Continual Learning
Effective Abstract Reasoning with Dual-Contrast Network
Effective and Efficient Vote Attack on Capsule Networks
Effective Distributed Learning with Random Features: Improved Bounds and Algorithms
Efficient Certified Defenses Against Patch Attacks on Image Classifiers
Efficient Conformal Prediction via Cascaded Inference with Expanded Admission
Efficient Continual Learning with Modular Networks and Task-Driven Priors
Efficient Empowerment Estimation for Unsupervised Stabilization
Efficient Generalized Spherical CNNs
Efficient Inference of Flexible Interaction in Spiking-neuron Networks
Efficient Reinforcement Learning in Factored MDPs with Application to Constrained RL
Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation
Efficient Wasserstein Natural Gradients for Reinforcement Learning
EigenGame: PCA as a Nash Equilibrium
Emergent Road Rules In Multi-Agent Driving Environments
Emergent Symbols through Binding in External Memory
Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition
Empirical or Invariant Risk Minimization? A Sample Complexity Perspective
End-to-end Adversarial Text-to-Speech
End-to-End Egospheric Spatial Memory
Energy-Based Models: Current Perspectives, Challenges, and Opportunities
Enforcing robust control guarantees within neural network policies
Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation
Entropic gradient descent algorithms and wide flat minima
Estimating and Evaluating Regression Predictive Uncertainty in Deep Object Detectors
Estimating informativeness of samples with Smooth Unique Information
Estimating Lipschitz constants of monotone deep equilibrium models
Evaluating the Disentanglement of Deep Generative Models through Manifold Topology
Evaluation of Neural Architectures Trained With Square Loss vs Cross-Entropy in Classification Tasks
Evaluation of Similarity-based Explanations
Evaluations and Methods for Explanation through Robustness Analysis
Evolving Reinforcement Learning Algorithms
Exemplary Natural Images Explain CNN Activations Better than State-of-the-Art Feature Visualization
Explainable Deep One-Class Classification
Explainable Subgraph Reasoning for Forecasting on Temporal Knowledge Graphs
Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning
Explaining the Efficacy of Counterfactually Augmented Data
Exploring Balanced Feature Spaces for Representation Learning
Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit
Expressive Power of Invariant and Equivariant Graph Neural Networks
Extracting Strong Policies for Robotics Tasks from Zero-Order Trajectory Optimizers
Extreme Memorization via Scale of Initialization
Factorizing Declarative and Procedural Knowledge in Structured, Dynamical Environments
FairBatch: Batch Selection for Model Fairness
FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders
Fair Mixup: Fairness via Interpolation
Fantastic Four: Differentiable and Efficient Bounds on Singular Values of Convolution Layers
Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers
Fast And Slow Learning Of Recurrent Independent Mechanisms
Fast convergence of stochastic subgradient method under interpolation
Faster Binary Embeddings for Preserving Euclidean Distances
Fast Geometric Projections for Local Robustness Certification
FastSpeech 2: Fast and High-Quality End-to-End Text to Speech
FedBE: Making Bayesian Model Ensemble Applicable to Federated Learning
FedBN: Federated Learning on Non-IID Features via Local Batch Normalization
Federated Learning Based on Dynamic Regularization
Federated Learning via Posterior Averaging: A New Perspective and Practical Algorithms
Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning
FedMix: Approximation of Mixup under Mean Augmented Federated Learning
Few-Shot Bayesian Optimization with Deep Kernel Surrogates
Few-Shot Learning via Learning the Representation, Provably
Fidelity-based Deep Adiabatic Scheduling
Filtered Inner Product Projection for Crosslingual Embedding Alignment
Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization
Fooling a Complete Neural Network Verifier
For self-supervised learning, Rationality implies generalization, provably
Fourier Neural Operator for Parametric Partial Differential Equations
Free Lunch for Few-shot Learning: Distribution Calibration
Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders
Fuzzy Tiling Activations: A Simple Approach to Learning Sparse Representations Online
GAN2GAN: Generative Noise Learning for Blind Denoising with Single Noisy Images
GANs Can Play Lottery Tickets Too
GAN "Steerability" without optimization
Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs
Generalization beyond the training distribution in brains and machines
Generalization bounds via distillation
Generalization in data-driven models of primary visual cortex
Generalized Energy Based Models
Generalized Multimodal ELBO
Generalized Variational Continual Learning
Generating Adversarial Computer Programs using Optimized Obfuscations
Generating Furry Cars: Disentangling Object Shape and Appearance across Multiple Domains
Generative Language-Grounded Policy in Vision-and-Language Navigation with Bayes' Rule
Generative Scene Graph Networks
Generative Time-series Modeling with Fourier Flows
Genetic Soft Updates for Policy Evolution in Deep Reinforcement Learning
Geometric and Topological Representation Learning
Geometric Deep Learning: the Erlangen Programme of ML
Geometry-Aware Gradient Algorithms for Neural Architecture Search
Geometry-aware Instance-reweighted Adversarial Training
Getting a CLUE: A Method for Explaining Uncertainty Estimates
Global Convergence of Three-layer Neural Networks in the Mean Field Regime
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime
Go with the flow: Adaptive control for Neural ODEs
Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability
Gradient Origin Networks
Gradient Projection Memory for Continual Learning
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models
gradSim: Differentiable simulation for system identification and visuomotor control
Graph-Based Continual Learning
Graph Coarsening with Neural Networks
GraphCodeBERT: Pre-training Code Representations with Data Flow
Graph Convolution with Low-rank Learnable Local Filters
Graph Edit Networks
Graph Information Bottleneck for Subgraph Recognition
Graph Traversal with Tensor Functionals: A Meta-Algorithm for Scalable Learning
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity
Grounded Language Learning Fast and Slow
Grounding Language to Autonomously-Acquired Skills via Goal Generation
Grounding Physical Concepts of Objects and Events Through Dynamic Visual Reasoning
Group Equivariant Conditional Neural Processes
Group Equivariant Generative Adversarial Networks
Group Equivariant Stand-Alone Self-Attention For Vision
Growing Efficient Deep Networks by Structured Continuous Sparsification
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents
Hardware-Aware Efficient Training of Deep Learning Models
Heating up decision boundaries: isocapacitory saturation, adversarial scenarios and generalization bounds
HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients
Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization
Hierarchical Autoregressive Modeling for Neural Video Compression
Hierarchical Reinforcement Learning by Discovering Intrinsic Options
High-Capacity Expert Binary Networks
Hopfield Networks is All You Need
Hopper: Multi-hop Transformer for Spatiotemporal Reasoning
How Benign is Benign Overfitting?
How Can Findings About The Brain Improve AI Systems?
How Does Mixup Help With Robustness and Generalization?
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
How to Find Your Friendly Neighborhood: Graph Attention Design with Self-Supervision
Human-Level Performance in No-Press Diplomacy via Equilibrium Search
HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark
Hyperbolic Neural Networks++
HyperDynamics: Meta-Learning Object and Agent Dynamics with Hypernetworks
HyperGrid Transformers: Towards A Single Model for Multiple Tasks
ICLR 2021 Workshop on Embodied Multimodal Learning (EML)
Identifying nonlinear dynamical systems with multiple time scales and long-range dependencies
Identifying Physical Law of Hamiltonian Systems via Meta-Learning
IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression
IEPT: Instance-Level and Episode-Level Pretext Tasks for Few-Shot Learning
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels
Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering
Impact of Representation Learning in Linear Bandits
Implicit Convex Regularizers of CNN Architectures: Convex Optimization of Two- and Three-Layer Networks in Polynomial Time
Implicit Gradient Regularization
Implicit Normalizing Flows
Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning
Improved Autoregressive Modeling with Distribution Smoothing
Improved Estimation of Concentration Under $\ell_p$-Norm Distance Metrics Using Half Spaces
Improve Object Detection with Feature-based Knowledge Distillation: Towards Accurate and Efficient Detectors
Improving Adversarial Robustness via Channel-wise Activation Suppressing
Improving Relational Regularized Autoencoders with Spherical Sliced Fused Gromov Wasserstein
Improving Transformation Invariance in Contrastive Representation Learning
Improving VAEs' Robustness to Adversarial Attack
Improving Zero-Shot Voice Style Transfer via Disentangled Representation Learning
Incorporating Symmetry into Deep Dynamics Models for Improved Generalization
Incremental few-shot learning via vector quantization in deep embedded space
In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning
Individually Fair Gradient Boosting
Individually Fair Rankings
Inductive Representation Learning in Temporal Networks via Causal Anonymous Walks
Influence Estimation for Generative Adversarial Networks
Influence Functions in Deep Learning Are Fragile
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective
Information Laundering for Model Privacy
Initialization and Regularization of Factorized Neural Layers
In-N-Out: Pre-Training and Self-Training using Auxiliary Information for Out-of-Distribution Robustness
In Search of Lost Domain Generalization
INT: An Inequality Benchmark for Evaluating Generalization in Theorem Proving
Integrating Categorical Semantics into Unsupervised Domain Translation
Interactive Weak Supervision: Learning Useful Heuristics for Data Labeling
Interpretable Models for Granger Causality Using Self-explaining Neural Networks
Interpretable Neural Architecture Search via Bayesian Optimisation with Weisfeiler-Lehman Kernels
Interpreting and Boosting Dropout from a Game-Theoretic View
Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking
Interpreting Knowledge Graph Relation Representation from Word Embeddings
Into the Wild with AudioScope: Unsupervised Audio-Visual Separation of On-Screen Sounds
Intraclass clustering: an implicit learning ability that regularizes DNNs
Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures
IOT: Instance-wise Layer Reordering for Transformer Structures
IsarStep: a Benchmark for High-level Mathematical Reasoning
Is Attention Better Than Matrix Decomposition?
Is Label Smoothing Truly Incompatible with Knowledge Distillation: An Empirical Study
Is My Dataset Biased?
Isometric Propagation Network for Generalized Zero-shot Learning
Isometric Transformation Invariant and Equivariant Graph Convolutional Networks
Isotropy in the Contextual Embedding Space: Clusters and Manifolds
Iterated learning for emergent systematicity in VQA
Iterative Empirical Game Solving via Single Policy Best Response
Kanerva++: Extending the Kanerva Machine With Differentiable, Locally Block Allocated Latent Memory
Knowledge Distillation as Semiparametric Inference
Knowledge distillation via softmax regression representation learning
LambdaNetworks: Modeling long-range Interactions without Attention
Language-Agnostic Representation Learning of Source Code from Structure and Context
Large Associative Memory Problem in Neurobiology and Machine Learning
Large Batch Simulation for Deep Reinforcement Learning
Large Scale Image Completion via Co-Modulated Generative Adversarial Networks
Large-width functional asymptotics for deep Gaussian neural networks
Latent Convergent Cross Mapping
Latent Skill Planning for Exploration and Transfer
Layer-adaptive Sparsity for the Magnitude-based Pruning
LEAF: A Learnable Frontend for Audio Classification
Learnable Embedding sizes for Recommender Systems
Learning Accurate Entropy Model with Global Reference for Image Compression
Learning advanced mathematical computations from examples
Learning a Latent Search Space for Routing Problems using Variational Autoencoders
Learning a Latent Simplex in Input Sparsity Time
Learning A Minimax Optimizer: A Pilot Study
Learning and Evaluating Representations for Deep One-Class Classification
Learning Associative Inference Using Fast Weight Memory
Learning-based Support Estimation in Sublinear Time
Learning Better Structured Representations Using Low-rank Adaptive Label Smoothing
Learning continuous-time PDEs from sparse data with graph neural networks
Learning Cross-Domain Correspondence for Control with Dynamics Cycle-Consistency
Learning Deep Features in Instrumental Variable Regression
Learning Energy-Based Generative Models via Coarse-to-Fine Expanding and Sampling
Learning Energy-Based Models by Diffusion Recovery Likelihood
Learning explanations that are hard to vary
Learning from Demonstration with Weakly Supervised Disentanglement
Learning from others' mistakes: Avoiding dataset biases without modeling them
Learning from Protein Structure with Geometric Vector Perceptrons
Learning Generalizable Visual Representations via Interactive Gameplay
Learning Hyperbolic Representations of Topological Features
Learning Incompressible Fluid Dynamics from Scratch - Towards Fast, Differentiable Fluid Models that Generalize
Learning Invariant Representations for Reinforcement Learning without Reconstruction
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks
Learning Manifold Patch-Based Representations of Man-Made Shapes
Learning Mesh-Based Simulation with Graph Networks
Learning Neural Event Functions for Ordinary Differential Equations
Learning Neural Generative Dynamics for Molecular Conformation Generation
Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch
Learning Parametrised Graph Shift Operators
Learning perturbation sets for robust machine learning
Learning Reasoning Paths over Semantic Graphs for Video-grounded Dialogues
Learning Robust State Abstractions for Hidden-Parameter Block MDPs
Learning Safe Multi-agent Control with Decentralized Neural Barrier Certificates
Learning Structural Edits via Incremental Tree Transformations
Learning Subgoal Representations with Slow Dynamics
Learning Task Decomposition with Ordered Memory Policy Network
Learning Task-General Representations with Generative Neuro-Symbolic Modeling
Learning the Pareto Front with Hypernetworks
Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation
Learning to Generate 3D Shapes with Generative Cellular Automata
Learning to live with Dale's principle: ANNs with separate excitatory and inhibitory units
Learning to Make Decisions via Submodular Regularization
Learning to Reach Goals via Iterated Supervised Learning
Learning to Recombine and Resample Data For Compositional Generalization
Learning to Represent Action Values as a Hypergraph on the Action Vertices
Learning to Sample with Local and Global Contexts in Experience Replay Buffer
Learning to Set Waypoints for Audio-Visual Navigation
Learning Value Functions in Deep Policy Gradients using Residual Variance
Learning "What-if" Explanations for Sequential Decision-Making
Learning What To Do by Simulating the Past
Learning with AMIGo: Adversarially Motivated Intrinsic Goals
Learning with Feature-Dependent Label Noise: A Progressive Approach
Learning with Instance-Dependent Label Noise: A Sample Sieve Approach
Lifelong Learning of Compositional Structures
LiftPool: Bidirectional ConvNet Pooling
Linear Convergent Decentralized Optimization with Compression
Linear Last-iterate Convergence in Constrained Saddle-point Optimization
Linear Mode Connectivity in Multitask and Continual Learning
Lipschitz Recurrent Neural Networks
Local Convergence Analysis of Gradient Descent Ascent with Finite Timescale Separation
Locally Free Weight Sharing for Network Width Search
Local Search Algorithms for Rank-Constrained Convex Optimization
Long Live the Lottery: The Existence of Winning Tickets in Lifelong Learning
Long Range Arena: A Benchmark for Efficient Transformers
Long-tailed Recognition by Routing Diverse Distribution-Aware Experts
Long-tail learning via logit adjustment
Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search
Lossless Compression of Structured Convolutional Models via Lifting
LowKey: Leveraging Adversarial Attacks to Protect Social Media Users from Facial Recognition
Machine Learning for Preventing and Combating Pandemics
MALI: A memory efficient and reverse accurate integrator for Neural ODEs
Mapping the Timescale Organization of Neural Language Models
MARS: Markov Molecular Sampling for Multi-objective Drug Discovery
Mastering Atari with Discrete World Models
Mathematical Reasoning via Self-supervised Skip-tree Training
Measuring Massive Multitask Language Understanding
MELR: Meta-Learning via Modeling Episode-Level Relationships for Few-Shot Learning
Memory Optimization for Deep Networks
Meta Back-Translation
Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning
Meta-Learning of Structured Task Distributions in Humans and Machines
Meta-learning Symmetries by Reparameterization
Meta-learning with negative learning rates
Meta-Learning with Neural Tangent Kernels
MetaNorm: Learning to Normalize Few-Shot Batches Across Domains
MiCE: Mixture of Contrastive Experts for Unsupervised Image Clustering
Mind the Gap when Conditioning Amortised Inference in Sequential Latent-Variable Models
Mind the Pad -- CNNs Can Develop Blind Spots
Minimum Width for Universal Approximation
Mirostat: A Neural Text Decoding Algorithm That Directly Controls Perplexity
Mixed-Features Vectors and Subspace Splitting
MixKD: Towards Efficient Distillation of Large-scale Language Models
MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space
Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose?
Model-Based Offline Planning
Model-Based Visual Planning with Self-Supervised Functional Distances
Modeling the Second Player in Distributionally Robust Optimization
Modelling Hierarchical Structure between Dialogue Policy and Natural Language Generator with Option Framework for Task-oriented Dialogue System
Model Patching: Closing the Subgroup Performance Gap with Data Augmentation
Molecule Optimization by Explainable Evolution
MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training
Monotonic Kronecker-Factored Lattice
Monte-Carlo Planning and Learning with Language Action Value Estimates
MoPro: Webly Supervised Learning with Momentum Prototypes
More or Less: When and How to Build Convolutional Neural Network Ensembles
MoVie: Revisiting Modulated Convolutions for Visual Counting and Beyond
Moving beyond the fairness rhetoric in machine learning
Multi-Class Uncertainty Calibration via Mutual Information Maximization-based Binning
Multi-Level Local SGD: Distributed SGD for Heterogeneous Hierarchical Networks
MultiModalQA: complex question answering over text, tables and images
Multiplicative Filter Networks
Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network
Multi-resolution modeling of a discrete stochastic process identifies causes of cancer
Multiscale Score Matching for Out-of-Distribution Detection
Multi-Time Attention Networks for Irregularly Sampled Time Series
Multi-timescale Representation Learning in LSTM Language Models
Multivariate Probabilistic Time Series Forecasting via Conditioned Normalizing Flows
Mutual Information State Intrinsic Control
My Body is a Cage: the Role of Morphology in Graph-Based Incompatible Control
NAS-Bench-ASR: Reproducible Neural Architecture Search for Speech Recognition
NBDT: Neural-Backed Decision Tree
Nearest Neighbor Machine Translation
Negative Data Augmentation
NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation
Net-DNF: Effective Deep Modeling of Tabular Data
Network Pruning That Matters: A Case Study on Retraining Variants
Neural Approximate Sufficient Statistics for Implicit Models
Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective
Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks
Neural Compression: From Information Theory to Applications
Neural Conversational AI: Bridging the Gap Between Research and Real World (NeuCAIR)
Neural Delay Differential Equations
Neural gradients are near-lognormal: improved quantized and sparse training
Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering
Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces
Neurally Augmented ALISTA
Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics
Neural Networks for Learning Counterfactual G-Invariances from Single Environments
Neural networks with late-phase weights
Neural ODE Processes
Neural Pruning via Growing Regularization
Neural representation and generation for RNA secondary structures
Neural Spatio-Temporal Point Processes
Neural Synthesis of Binaural Speech From Mono Audio
Neural Thompson Sampling
Neural Topic Model via Optimal Transport
New Bounds For Distributed Mean Estimation and Variance Reduction
No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks
Noise against noise: stochastic label noise helps combat inherent label noise
Noise or Signal: The Role of Image Backgrounds in Object Recognition
No MCMC for me: Amortized sampling for fast and stable training of energy-based models
Non-asymptotic Confidence Intervals of Off-policy Evaluation: Primal and Dual Bounds
Nonseparable Symplectic Neural Networks
not-MIWAE: Deep Generative Modelling with Missing not at Random Data
NOVAS: Non-convex Optimization via Adaptive Stochastic Search for End-to-end Learning and Control
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers
Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation
On Data-Augmentation and Consistency-Based Semi-Supervised Learning
On Dyadic Fairness: Exploring and Mitigating Bias in Graph Connections
One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks
On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning
On Graph Neural Networks versus Graph-Augmented MLPs
On InstaHide, Phase Retrieval, and Sparse Matrix Factorization
On Learning Universal Representations Across Languages
Online Adversarial Purification based on Self-supervised Learning
On Position Embeddings in BERT
On Self-Supervised Image Representations for GAN Evaluation
On Statistical Bias In Active Learning: How and When to Fix It
On the Bottleneck of Graph Neural Networks and its Practical Implications
On the Critical Role of Conventions in Adaptive Human-AI Collaboration
On the Curse of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis
On the Dynamics of Training Attention Models
On the geometry of generalization and memorization in deep neural networks
On the Impossibility of Global Convergence in Multi-Loss Optimization
On the mapping between Hopfield networks and Restricted Boltzmann Machines
On the Origin of Implicit Regularization in Stochastic Gradient Descent
On the role of planning in model-based deep reinforcement learning
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers
On the Transfer of Disentangled Representations in Realistic Settings
On the Universality of Rotation Equivariant Point Cloud Networks
On the Universality of the Double Descent Peak in Ridgeless Regression
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning
Open Question Answering over Tables and Text
Optimal Conversion of Conventional Artificial Neural Networks to Spiking Neural Networks
Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime
Optimal Regularization can Mitigate Double Descent
Optimism in Reinforcement Learning with Generalized Linear Function Approximation
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning
Orthogonalizing Convolutional Layers with the Cayley Transform
Overfitting for Fun and Profit: Instance-Adaptive Data Compression
Overparameterisation and worst-case generalisation: friend or foe?
PAC Confidence Predictions for Deep Neural Network Classifiers
Parameter-Based Value Functions
Parameter Efficient Multimodal Transformers for Video Representation Learning
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning
Partitioned Learned Bloom Filters
PC2WF: 3D Wireframe Reconstruction from Raw Point Clouds
PDE-Driven Spatiotemporal Disentanglement
Perceiving the 3D World from Images and Video
Perceptual Adversarial Robustness: Defense Against Unseen Threat Models
Personalized Federated Learning with First Order Model Optimization
Physics-aware, probabilistic model order reduction with guaranteed stability
Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks
Planning from Pixels using Inverse Dynamics Models
PlasticineLab: A Soft-Body Manipulation Benchmark with Differentiable Physics
PMI-Masking: Principled masking of correlated spans
PolarNet: Learning to Optimize Polar Keypoints for Keypoint Based Object Detection
Policy-Driven Attack: Learning to Query for Hard-label Black-box Adversarial Examples
Practical Massively Parallel Monte-Carlo Tree Search Applied to Molecular Design
Practical Real Time Recurrent Learning with a Sparse Approximation
Predicting Classification Accuracy When Adding New Unobserved Classes
Predicting Inductive Biases of Pre-Trained Models
Predicting Infectiousness for Proactive Contact Tracing
Prediction and generalisation over directed actions by grid cells
Pre-training Text-to-Text Transformers for Concept-centric Common Sense
Primal Wasserstein Imitation Learning
Private Image Reconstruction from System Side Channels Using Generative Models
Private Post-GAN Boosting
Probabilistic Numeric Convolutional Neural Networks
Probing BERT in Hyperbolic Spaces
Progressive Skeletonization: Trimming more fat from a network at initialization
Projected Latent Markov Chain Monte Carlo: Conditional Sampling of Normalizing Flows
Property Controllable Variational Autoencoder via Invertible Mutual Dependence
Protecting DNNs from Theft using an Ensemble of Diverse Models
Prototypical Contrastive Learning of Unsupervised Representations
Prototypical Representation Learning for Relation Extraction
Provable Rich Observation Reinforcement Learning with Combinatorial Latent States
Provably robust classification of adversarial examples with detection
Proximal Gradient Descent-Ascent: Variable Convergence under KŁ Geometry
Pruning Neural Networks at Initialization: Why Are We Missing the Mark?
PseudoSeg: Designing Pseudo Labels for Semantic Segmentation
PSTNet: Point Spatio-Temporal Convolution on Point Cloud Sequences
QPLEX: Duplex Dueling Multi-Agent Q-Learning
Quantifying Differences in Reward Functions
Random Feature Attention
Randomized Automatic Differentiation
Randomized Ensembled Double Q-Learning: Learning Fast Without a Model
Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments
Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient Estimator
Rapid Neural Architecture Search by Learning to Generate Graphs from Datasets
Rapid Task-Solving in Novel Environments
Recurrent Independent Mechanisms
Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks
Refining Deep Generative Models via Discriminator Gradient Flow
Regularization Matters in Policy Optimization - An Empirical Study on Continuous Control
Regularized Inverse Reinforcement Learning
Reinforcement Learning with Random Delays
Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models
Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting
Removing Undesirable Feature Contributions Using Out-of-Distribution Data
Representation Balancing Offline Model-based Reinforcement Learning
Representation learning for improved interpretability and classification accuracy of clinical factors from EEG
Representation Learning for Sequence Data with Deep Autoencoding Predictive Components
Representation Learning via Invariant Causal Mechanisms
Representing Partial Programs with Blended Abstract Semantics
Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning
Reset-Free Lifelong Learning with Skill-Space Planning
ResNet After All: Neural ODEs and Their Numerical Solution
Responsible AI (RAI)
Rethinking Architecture Selection in Differentiable NAS
Rethinking Attention with Performers
Rethinking Embedding Coupling in Pre-trained Language Models
Rethinking Positional Encoding in Language Pre-training
Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective
Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability
Retrieval-Augmented Generation for Code Summarization via Hybrid GNN
Return-Based Contrastive Representation Learning for Reinforcement Learning
Revisiting Dynamic Convolution via Matrix Decomposition
Revisiting Few-sample BERT Fine-tuning
Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction
Revisiting Locally Supervised Learning: an Alternative to End-to-end Training
Reweighting Augmented Samples by Minimizing the Maximal Expected Loss
R-GAP: Recursive Gradient Attack on Privacy
Ringing ReLUs: Harmonic Distortion Analysis of Nonlinear Feedforward Networks
Risk-Averse Offline Reinforcement Learning
RMSprop converges with proper hyper-parameter
RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs
Robust and Generalizable Visual Representation Learning via Random Convolutions
Robust and reliable machine learning in the real world
Robust Curriculum Learning: from clean label detection to noisy label self-correction
Robust early-learning: Hindering the memorization of noisy labels
Robust Learning of Fixed-Structure Bayesian Networks in Nearly-Linear Time
Robust Overfitting may be mitigated by properly learned smoothening
Robust Pruning at Initialization
Robust Reinforcement Learning on State Observations with Learned Optimal Adversary
RODE: Learning Roles to Decompose Multi-Agent Tasks
S2D-OLAD: From shallow to deep, overcoming limited and adverse data
SAFENet: A Secure, Accurate and Fast Neural Network Inference
SALD: Sign Agnostic Learning with Derivatives
Saliency is a Possible Red Herring When Diagnosing Poor Generalization
SaliencyMix: A Saliency Guided Data Augmentation Strategy for Better Regularization
Sample-Efficient Automated Deep Reinforcement Learning
Scalable Bayesian Inverse Reinforcement Learning
Scalable Learning and MAP Inference for Nonsymmetric Determinantal Point Processes
Scalable Transfer Learning with Expert Models
Scaling Symbolic Methods using Gradients for Neural Model Explanation
Scaling the Convex Barrier with Active Sets
Science and Engineering of Deep Learning
Score-Based Generative Modeling through Stochastic Differential Equations
SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing
Security and Safety in Machine Learning Systems
SEDONA: Search for Decoupled Neural Networks toward Greedy Block-wise Learning
SEED: Self-supervised Distillation For Visual Representation
Selective Classification Can Magnify Disparities Across Groups
Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs
Self-supervised Adversarial Robustness for the Low-label, High-data Regime
Self-supervised Learning from a Multi-view Perspective
Self-Supervised Learning of Compressed Video Representations
Self-Supervised Policy Adaptation during Deployment
Self-supervised Representation Learning with Relative Predictive Coding
Self-supervised Visual Reinforcement Learning with Object-centric Representations
Self-Supervision for Learning from the Bottom Up
Self-Supervision for Reinforcement Learning
Self-training For Few-shot Transfer Across Extreme Task Differences
Semantic Re-tuning with Contrastive Tension
Semi-supervised Keypoint Localization
SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness
Separation and Concentration in Deep Networks
Seq2Tens: An Efficient Representation of Sequences by Low-Rank Tensor Projections
Sequential Density Ratio Estimation for Simultaneous Optimization of Speed and Accuracy
Set Prediction without Imposing Structure as Conditional Density Estimation
Shape or Texture: Understanding Discriminative Features in CNNs
Shape-Texture Debiased Neural Network Training
Shapley explainability on the data manifold
Shapley Explanation Networks
Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation
Sharper Generalization Bounds for Learning with Gradient-dominated Objective Functions
Sharpness-aware Minimization for Efficiently Improving Generalization
Signatory: differentiable computations of the signature and logsignature transforms, on both CPU and GPU
Simple Augmentation Goes a Long Way: ADRL for DNN Quantization
Simple Spectral Graph Convolution
Single-Photon Image Classification
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
SkipW: Resource Adaptable RNN with Strict Upper Computational Limit
Sliced Kernelized Stein Discrepancy
SMiRL: Surprise Minimizing Reinforcement Learning in Unstable Environments
Soft bodied robots for human centered design of robots for everyday life
SOLAR: Sparse Orthogonal Learned and Random Embeddings
Solving Compositional Reinforcement Learning Problems via Task Reduction
Sparse encoding for more-interpretable feature-selecting representations in probabilistic matrix factorization
Sparse Quantized Spectral Clustering
Spatial Dependency Networks: Neural Layers for Improved Generative Image Modeling
Spatially Structured Recurrent Modules
Spatio-Temporal Graph Scattering Transform
SSD: A Unified Framework for Self-Supervised Outlier Detection
Stabilized Medical Image Attacks
Statistical inference for individual fairness
Stochastic Security: Adversarial Defense Using Long-Run Dynamics of Energy-Based Models
Structured Prediction as Translation between Augmented Natural Languages
Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning
Support-set bottlenecks for video-text representation learning
Symmetry-Aware Actor-Critic for 3D Molecular Design
Synthetic Data Generation: Quality, Privacy, Bias
Systematic generalisation with group invariant predictions
Taking Notes on the Fly Helps Language Pre-Training
Taming GANs with Lookahead-Minmax
Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits
Task-Agnostic Morphology Evolution
Teaching Temporal Logics to Neural Networks
Teaching with Commentaries
Temporally-Extended ε-Greedy Exploration
Tent: Fully Test-Time Adaptation by Entropy Minimization
Text Generation by Learning from Demonstrations
The Deep Bootstrap Framework: Good Online Learners are Good Offline Generalizers
The geometry of integration in text classification RNNs
The Importance of Pessimism in Fixed-Dataset Policy Optimization
The inductive bias of ReLU networks on orthogonally separable data
The Intrinsic Dimension of Images and Its Impact on Learning
Theoretical Analysis of Self-Training with Deep Networks on Unlabeled Data
Theoretical bounds on estimation error for meta-learning
The Recurrent Neural Tangent Kernel
The Risks of Invariant Risk Minimization
The role of Disentanglement in Generalisation
The Role of Mathematical Reasoning in General Artificial Intelligence
The Role of Momentum Parameters in the Optimal Convergence of Adaptive Polyak's Heavy-ball Methods
The Traveling Observer Model: Multi-task Learning Through Spatial Variable Embeddings
The Unreasonable Effectiveness of Patches in Deep Convolutional Kernels Methods
Tilted Empirical Risk Minimization
Tomographic Auto-Encoder: Unsupervised Bayesian Recovery of Corrupted Data
Topology-Aware Segmentation Using Discrete Morse Theory
Towards Faster and Stabilized GAN Training for High-fidelity Few-shot Image Synthesis
Towards Impartial Multi-task Learning
Towards Nonlinear Disentanglement in Natural Data with Temporal Sparse Coding
Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning
Towards Robustness Against Natural Language Word Substitutions
Towards Robust Neural Networks via Close-loop Control
Tradeoffs in Data Augmentation: An Empirical Study
Training BatchNorm and Only BatchNorm: On the Expressive Power of Random Features in CNNs
Training GANs with Stronger Augmentations via Contrastive Discriminator
Training independent subnetworks for robust prediction
Training with Quantization Noise for Extreme Model Compression
Trajectory Prediction using Equivariant Continuous Convolution
Transformer protein language models are unsupervised structure learners
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning
TropEx: An Algorithm for Extracting Linear Terms in Deep Neural Networks
Trusted Multi-View Classification
UMEC: Unified model and embedding compression for efficient recommendation systems
Unbiased Teacher for Semi-Supervised Object Detection
Uncertainty-aware Active Learning for Optimal Bayesian Classifier
Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs
Uncertainty Estimation in Autoregressive Structured Prediction
Uncertainty in Gradient Boosting via Ensembles
Uncertainty Sets for Image Classifiers using Conformal Prediction
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning
Understanding and Improving Lexical Choice in Non-Autoregressive Translation
Understanding Over-parameterization in Generative Adversarial Networks
Understanding the effects of data parallelism and sparsity on neural network training
Understanding the failure modes of out-of-distribution generalization
Understanding the role of importance weighting for deep learning
Undistillable: Making A Nasty Teacher That CANNOT teach students
Universal approximation power of deep residual neural networks via nonlinear control theory
Universal Weakly Supervised Segmentation by Pixel-to-Segment Contrastive Learning
Unlearnable Examples: Making Personal Data Unexploitable
Unsupervised Audiovisual Synthesis via Exemplar Autoencoders
Unsupervised Discovery of 3D Physical Objects
Unsupervised Meta-Learning through Latent-Space Interpolation in Generative Models
Unsupervised Object Keypoint Learning using Local Spatial Predictability
Unsupervised Representation Learning for Time Series with Temporal Neighborhood Coding
UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers
Usable Information and Evolution of Optimal Representations During Training
Using latent space regression to analyze and leverage compositionality in GANs
VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models
VA-RED$^2$: Video Adaptive Redundancy Reduction
Variational Information Bottleneck for Effective Low-Resource Fine-Tuning
Variational Intrinsic Control Revisited
Variational State-Space Models for Localisation and Dense 3D Mapping in 6 DoF
VCNet and Functional Targeted Regularization For Learning Causal Effects of Continuous Treatments
Vector-output ReLU Neural Network Problems are Copositive Programs: Convex Analysis of Two Layer Networks and Polynomial-time Algorithms
Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images
Viewmaker Networks: Learning Views for Unsupervised Representation Learning
VTNet: Visual Transformer Network for Object Goal Navigation
Vulnerability-Aware Poisoning Mechanism for Online RL with Unknown Dynamics
Wandering within a world: Online contextualized few-shot learning
WaNet - Imperceptible Warping-based Backdoor Attack
Wasserstein-2 Generative Networks
Wasserstein Embedding for Graph Learning
Watch-And-Help: A Challenge for Social Perception and Human-AI Collaboration
WaveGrad: Estimating Gradients for Waveform Generation
What are the Statistical Limits of Offline RL with Linear Function Approximation?
What Can You Learn From Your Muscles? Learning Visual Representation from Human Interactions
What Makes Instance Discrimination Good for Transfer Learning?
What Matters for On-Policy Deep Actor-Critic Methods? A Large-Scale Study
What Should Not Be Contrastive in Contrastive Learning
What they do when in doubt: a study of inductive biases in seq2seq learners
When Do Curricula Work?
When does preconditioning help or hurt generalization?
When Optimizing $f$-Divergence is Robust with Label Noise
Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets?
Why resampling outperforms reweighting for correcting sampling bias with stochastic gradients
Winning the L2RPN Challenge: Power Grid Management via Semi-Markov Afterstate Actor-Critic
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching
Workshop on Distributed and Private Machine Learning
Workshop on Enormous Language Models: Perspectives and Benchmarks
Workshop on Learning to Learn
Workshop on Neural Architecture Search
Workshop on Weakly Supervised Learning
WrapNet: Neural Net Inference with Ultra-Low-Precision Arithmetic
X2T: Training an X-to-Text Typing Interface with Online Learning from User Feedback
You Only Need Adversarial Supervision for Semantic Image Synthesis
Zero-Cost Proxies for Lightweight NAS
Zero-shot Synthesis with Group-Supervised Learning