Topic Keywords

[ $\ell_1$ norm ] [ $f$-divergence ] [ 3D Convolution ] [ 3D deep learning ] [ 3D generation ] [ 3d point cloud ] [ 3D Reconstruction ] [ 3D scene understanding ] [ 3D shape representations ] [ 3D shapes learning ] [ 3D vision ] [ 3D Vision ] [ abstract reasoning ] [ abstract rules ] [ Acceleration ] [ accuracy ] [ acoustic condition modeling ] [ Action localization ] [ action recognition ] [ activation maximization ] [ activation strategy. ] [ Active learning ] [ Active Learning ] [ AdaBoost ] [ adaptive heavy-ball methods ] [ Adaptive Learning ] [ adaptive methods ] [ adaptive optimization ] [ ADMM ] [ Adversarial Accuracy ] [ Adversarial Attack ] [ Adversarial Attacks ] [ adversarial attacks/defenses ] [ Adversarial computer programs ] [ Adversarial Defense ] [ Adversarial Example Detection ] [ Adversarial Examples ] [ Adversarial Learning ] [ Adversarial Machine Learning ] [ adversarial patch ] [ Adversarial robustness ] [ Adversarial Robustness ] [ Adversarial training ] [ Adversarial Training ] [ Adversarial Transferability ] [ aesthetic assessment ] [ affine parameters ] [ age estimation ] [ Aggregation Methods ] [ AI for earth science ] [ ALFRED ] [ Algorithm ] [ algorithmic fairness ] [ Algorithmic fairness ] [ Algorithms ] [ alignment ] [ alignment of semantic and visual space ] [ amortized inference ] [ Analogies ] [ annotation artifacts ] [ anomaly-detection ] [ Anomaly detection with deep neural networks ] [ anonymous walk ] [ appearance transfer ] [ approximate constrained optimization ] [ approximation ] [ Approximation ] [ Architectures ] [ argoverse ] [ Artificial Intelligence ] [ ASR ] [ assistive technology ] [ associative memory ] [ Associative Memory ] [ asynchronous parallel algorithm ] [ Atari ] [ Attention ] [ Attention Mechanism ] [ Attention Modules ] [ attractors ] [ attributed walks ] [ Auction Theory ] [ audio understanding ] [ Audio-Visual ] [ audio visual learning ] [ audio-visual representation ] [ audio-visual representation 
learning ] [ Audio-visual sound separation ] [ audiovisual synthesis ] [ augmented deep reinforcement learning ] [ autodiff ] [ Autoencoders ] [ automated data augmentation ] [ automated machine learning ] [ automatic differentiation ] [ AutoML ] [ autonomous learning ] [ autoregressive language model ] [ Autoregressive Models ] [ AutoRL ] [ auxiliary information ] [ auxiliary latent variable ] [ Auxiliary Learning ] [ auxiliary task ] [ Average-case Analysis ] [ adversarial examples ] [ avoid knowledge leaking ] [ backdoor attack ] [ Backdoor Attacks ] [ Backdoor Defense ] [ Backgrounds ] [ backprop ] [ back translation ] [ backward error analysis ] [ bagging ] [ batchnorm ] [ Batch Normalization ] [ batch reinforcement learning ] [ Batch Reinforcement Learning ] [ batch selection ] [ Bayesian ] [ Bayesian classification ] [ Bayesian inference ] [ Bayesian Inference ] [ Bayesian networks ] [ Bayesian Neural Networks ] [ behavior cloning ] [ belief-propagation ] [ Benchmark ] [ benchmarks ] [ benign overfitting ] [ bert ] [ BERT ] [ beta-VAE ] [ better generalization ] [ biased sampling ] [ biases ] [ Bias in Language Models ] [ bidirectional ] [ bilevel optimization ] [ Bilinear games ] [ Binary Embeddings ] [ Binary Neural Networks ] [ binaural audio ] [ binaural speech ] [ biologically plausible ] [ Biometrics ] [ bisimulation ] [ Bisimulation ] [ bisimulation metrics ] [ bit-flip ] [ bit-level sparsity ] [ blind denoising ] [ blind spots ] [ block mdp ] [ boosting ] [ bottleneck ] [ bptt ] [ branch and bound ] [ Brownian motion ] [ Budget-Aware Pruning ] [ Budget constraints ] [ Byzantine resilience ] [ Byzantine SGD ] [ CAD modeling ] [ calibration ] [ Calibration ] [ calibration measure ] [ cancer research ] [ Capsule Networks ] [ Catastrophic forgetting ] [ Catastrophic Forgetting ] [ Causal Inference ] [ Causality ] [ Causal network ] [ certificate ] [ certified defense ] [ Certified Robustness ] [ challenge sets ] [ change of measure ] [ change point 
detection ] [ channel suppressing ] [ Channel Tensorization ] [ Channel-Wise Approximated Activation ] [ Chaos ] [ chebyshev polynomial ] [ checkpointing ] [ Checkpointing ] [ chemistry ] [ CIFAR ] [ Classification ] [ class imbalance ] [ clean-label ] [ Clustering ] [ Clusters ] [ CNN ] [ CNNs ] [ Code Compilation ] [ Code Representations ] [ Code Structure ] [ code summarization ] [ Code Summarization ] [ Cognitively-inspired Learning ] [ cold posteriors ] [ collaborative learning ] [ Combinatorial optimization ] [ common object counting ] [ commonsense question answering ] [ Commonsense Reasoning ] [ Communication Compression ] [ co-modulation ] [ complete verifiers ] [ complex query answering ] [ Composition ] [ compositional generalization ] [ compositional learning ] [ compositional task ] [ Compressed videos ] [ Compressing Deep Networks ] [ Compression ] [ computation ] [ computational biology ] [ Computational Biology ] [ computational complexity ] [ Computational imaging ] [ Computational neuroscience ] [ Computational resources ] [ computer graphics ] [ Computer Vision ] [ concentration ] [ Concentration of Measure ] [ Concept-based Explanation ] [ concept drift ] [ Concept Learning ] [ conditional expectation ] [ Conditional GANs ] [ Conditional Generation ] [ Conditional generative adversarial networks ] [ conditional layer normalization ] [ Conditional Neural Processes ] [ Conditional Risk Minimization ] [ Conditional Sampling ] [ conditional text generation ] [ Conferrability ] [ confidentiality ] [ conformal inference ] [ conformal prediction ] [ conjugacy ] [ conservation law ] [ consistency ] [ consistency training ] [ Consistency Training ] [ constellation models ] [ constrained beam search ] [ Constrained optimization ] [ constrained RL ] [ constraints ] [ constraint satisfaction ] [ contact tracing ] [ Contextual Bandits ] [ Contextual embedding space ] [ Continual learning ] [ Continual Learning ] [ continuation method ] [ continuous and 
scalar conditions ] [ continuous case ] [ Continuous Control ] [ continuous convolution ] [ continuous games ] [ continuous normalizing flow ] [ continuous time ] [ Continuous-time System ] [ continuous treatment effect ] [ contrastive divergence ] [ Contrastive learning ] [ Contrastive Learning ] [ Contrastive Methods ] [ contrastive representation learning ] [ control barrier function ] [ controlled generation ] [ Controlled NLG ] [ Convergence ] [ Convergence Analysis ] [ convex duality ] [ Convex optimization ] [ ConvNets ] [ convolutional kernel methods ] [ Convolutional Layer ] [ convolutional models ] [ Convolutional Networks ] [ copositive programming ] [ corruptions ] [ COST ] [ Counterfactual inference ] [ counterfactuals ] [ Counterfactuals ] [ covariant neural networks ] [ covid-19 ] [ COVID-19 ] [ Cross-domain ] [ cross-domain few-shot learning ] [ cross-domain video generation ] [ cross-episode attention ] [ cross-fitting ] [ cross-lingual pretraining ] [ Cryptographic inference ] [ cultural transmission ] [ Curriculum Learning ] [ curse of memory ] [ curvature estimates ] [ custom voice ] [ cycle-consistency regularization ] [ cycle-consistency regularizer ] [ DAG ] [ DARTS stability ] [ Data augmentation ] [ Data Augmentation ] [ data cleansing ] [ Data-driven modeling ] [ data-efficient learning ] [ data-efficient RL ] [ Data Flow ] [ data labeling ] [ data parallelism ] [ Data Poisoning ] [ Data Protection ] [ Dataset ] [ dataset bias ] [ dataset compression ] [ dataset condensation ] [ dataset corruption ] [ dataset distillation ] [ dataset summarization ] [ data structures ] [ debiased training ] [ debugging ] [ Decentralized Optimization ] [ decision boundary geometry ] [ decision trees ] [ declarative knowledge ] [ deep-anomaly-detection ] [ Deep Architectures ] [ Deep denoising priors ] [ deep embedding ] [ Deep Ensembles ] [ deep equilibrium models ] [ Deep Equilibrium Models ] [ Deepfake ] [ deep FBSDEs ] [ Deep Gaussian Processes ] [ Deep 
generative model ] [ Deep generative modeling ] [ Deep generative models ] [ deeplearning ] [ Deep learning ] [ Deep Learning ] [ deep learning dynamics ] [ Deep Learning Theory ] [ deep network training ] [ deep neural network ] [ deep neural networks. ] [ Deep Neural Networks ] [ deep one-class classification ] [ deep Q-learning ] [ Deep reinforcement learning ] [ Deep Reinforcement Learning ] [ deep ReLU networks ] [ Deep residual neural networks ] [ deep RL ] [ deep sequence model ] [ deepset ] [ Deep Sets ] [ Deformation Modeling ] [ delay ] [ Delay differential equations ] [ denoising score matching ] [ Dense Retrieval ] [ Density estimation ] [ Density Estimation ] [ Density ratio estimation ] [ dependency based method ] [ deployment-efficiency ] [ depression ] [ depth separation ] [ descent ] [ description length ] [ determinantal point processes ] [ Device Placement ] [ dialogue state tracking ] [ differentiable optimization ] [ Differentiable physics ] [ Differentiable Physics ] [ Differentiable program generator ] [ differentiable programming ] [ Differentiable rendering ] [ Differentiable simulation ] [ differential dynamic programming ] [ differential equations ] [ Differential Geometry ] [ differentially private deep learning ] [ Differential Privacy ] [ diffusion probabilistic models ] [ diffusion process ] [ dimension ] [ Directed Acyclic Graphs ] [ Dirichlet form ] [ Discrete Optimization ] [ discretization error ] [ disentangled representation learning ] [ Disentangled representation learning ] [ Disentanglement ] [ distance ] [ Distillation ] [ distinct elements ] [ Distributed ] [ distributed deep learning ] [ distributed inference ] [ Distributed learning ] [ distributed machine learning ] [ Distributed ML ] [ Distributed Optimization ] [ distributional robust optimization ] [ distribution estimation ] [ distribution shift ] [ diverse strategies ] [ diverse video generation ] [ Diversity denoising ] [ Diversity Regularization ] [ DNN ] [ DNN 
compression ] [ document analysis ] [ document classification ] [ document retrieval ] [ domain adaptation theory ] [ Domain Adaption ] [ Domain Generalization ] [ domain randomization ] [ Domain Translation ] [ double descent ] [ Double Descent ] [ doubly robustness ] [ Doubly-weighted Laplace operator ] [ Dropout ] [ drug discovery ] [ Drug discovery ] [ dst ] [ Dual-mode ASR ] [ Dueling structure ] [ Dynamical Systems ] [ dynamic computation graphs ] [ dynamics ] [ dynamics prediction ] [ dynamic systems ] [ Early classification ] [ Early pruning ] [ early stopping ] [ EBM ] [ Edit ] [ EEG ] [ effective learning rate ] [ Efficiency ] [ Efficient Attention Mechanism ] [ efficient deep learning ] [ Efficient Deep Learning ] [ Efficient Deep Learning Inference ] [ Efficient ensembles ] [ efficient inference ] [ efficient inference methods ] [ Efficient Inference Methods ] [ EfficientNets ] [ efficient network ] [ Efficient Networks ] [ Efficient training ] [ Efficient Training ] [ efficient training and inference. 
] [ egocentric ] [ eigendecomposition ] [ Eigenspectrum ] [ ELBO ] [ electroencephalography ] [ EM ] [ Embedding Models ] [ Embedding Size ] [ Embodied Agents ] [ embodied vision ] [ emergent behavior ] [ empirical analysis ] [ Empirical Game Theory ] [ empirical investigation ] [ Empirical Investigation ] [ empirical study ] [ empowerment ] [ Encoder layer fusion ] [ end-to-end entity linking ] [ End-to-End Object Detection ] [ Energy ] [ Energy-Based GANs ] [ energy based model ] [ energy-based model ] [ Energy-based model ] [ energy based models ] [ Energy-based Models ] [ Energy Based Models ] [ Energy-Based Models ] [ Energy Score ] [ ensemble ] [ Ensemble ] [ ensemble learning ] [ ensembles ] [ Ensembles ] [ entity disambiguation ] [ entity linking ] [ entity retrieval ] [ entropic algorithms ] [ Entropy Maximization ] [ Entropy Model ] [ entropy regularization ] [ epidemiology ] [ episode-level pretext task ] [ episodic training ] [ equilibrium ] [ equivariant ] [ equivariant neural network ] [ ERP ] [ Evaluation ] [ evaluation of interpretability ] [ Event localization ] [ evolution ] [ Evolutionary algorithm ] [ Evolutionary Algorithm ] [ Evolutionary Algorithms ] [ Excess risk ] [ experience replay buffer ] [ experimental evaluation ] [ Expert Models ] [ Explainability ] [ explainable ] [ Explainable AI ] [ Explainable Model ] [ explaining decision-making ] [ explanation method ] [ explanations ] [ Explanations ] [ Exploration ] [ Exponential Families ] [ exponential tilting ] [ exposition ] [ external memory ] [ Extrapolation ] [ extremal sector ] [ facial recognition ] [ factor analysis ] [ factored MDP ] [ Factored MDP ] [ fairness ] [ Fairness ] [ faithfulness ] [ fast DNN inference ] [ fast learning rate ] [ fast-mapping ] [ fast weights ] [ FAVOR ] [ Feature Attribution ] [ feature propagation ] [ features ] [ feature visualization ] [ Feature Visualization ] [ Federated learning ] [ Federated Learning ] [ Few Shot ] [ few-shot concept learning ] [ 
few-shot domain generalization ] [ Few-shot learning ] [ Few Shot Learning ] [ fine-tuning ] [ finetuning ] [ Fine-tuning ] [ Finetuning ] [ fine-tuning stability ] [ Fingerprinting ] [ First-order Methods ] [ first-order optimization ] [ fisher ratio ] [ flat minima ] [ Flexibility ] [ flow graphs ] [ Fluid Dynamics ] [ Follow-the-Regularized-Leader ] [ Formal Verification ] [ forward mode ] [ Fourier Features ] [ Fourier transform ] [ framework ] [ Frobenius norm ] [ from-scratch ] [ frontend ] [ fruit fly ] [ fully-connected ] [ Fully-Connected Networks ] [ future frame generation ] [ future link prediction ] [ fuzzy tiling activation function ] [ Game Decomposition ] [ Game Theory ] [ GAN ] [ GAN compression ] [ GANs ] [ Garbled Circuits ] [ Gaussian Copula ] [ Gaussian Graphical Model ] [ Gaussian Isoperimetric Inequality ] [ Gaussian mixture model ] [ Gaussian process ] [ Gaussian Process ] [ Gaussian Processes ] [ gaussian process priors ] [ GBDT ] [ generalisation ] [ Generalization ] [ Generalization Bounds ] [ generalization error ] [ Generalization Measure ] [ Generalization of Reinforcement Learning ] [ generalized ] [ generalized Girsanov theorem ] [ Generalized PageRank ] [ Generalized zero-shot learning ] [ Generation ] [ Generative Adversarial Network ] [ Generative Adversarial Networks ] [ generative art ] [ Generative Flow ] [ Generative Model ] [ Generative modeling ] [ Generative Modeling ] [ generative modelling ] [ Generative Modelling ] [ Generative models ] [ Generative Models ] [ genetic programming ] [ Geodesic-Aware FC Layer ] [ geometric ] [ Geometric Deep Learning ] [ G-invariance regularization ] [ global ] [ global optima ] [ Global Reference ] [ glue ] [ GNN ] [ GNNs ] [ goal-conditioned reinforcement learning ] [ goal-conditioned RL ] [ goal reaching ] [ gradient ] [ gradient alignment ] [ Gradient Alignment ] [ gradient boosted decision trees ] [ gradient boosting ] [ gradient decomposition ] [ Gradient Descent ] [ gradient 
descent-ascent ] [ gradient flow ] [ Gradient flow ] [ gradient flows ] [ gradient redundancy ] [ Gradient stability ] [ Grammatical error correction ] [ Granger causality ] [ Graph ] [ graph classification ] [ graph coarsening ] [ Graph Convolutional Network ] [ Graph Convolutional Neural Networks ] [ graph edit distance ] [ Graph Generation ] [ Graph Generative Model ] [ graph-level prediction ] [ graph networks ] [ Graph neural network ] [ Graph Neural Network ] [ Graph neural networks ] [ Graph Neural Networks ] [ Graph pooling ] [ graph representation learning ] [ Graph representation learning ] [ Graph Representation Learning ] [ graph shift operators ] [ graph-structured data ] [ graph structure learning ] [ Greedy Learning ] [ grid cells ] [ grounding ] [ group disparities ] [ group equivariance ] [ Group Equivariance ] [ Group Equivariant Convolution ] [ group equivariant self-attention ] [ group equivariant transformers ] [ group sparsity ] [ Group-supervised learning ] [ gumbel-softmax ] [ Hamiltonian systems ] [ hard-label attack ] [ hard negative mining ] [ hard negative sampling ] [ Hardware-Aware Neural Architecture Search ] [ Harmonic Analysis ] [ harmonic distortion analysis ] [ healthcare ] [ Healthcare ] [ heap allocation ] [ Hessian matrix ] [ Heterogeneity ] [ Heterogeneous ] [ heterogeneous data ] [ Heterogeneous data ] [ Heterophily ] [ heteroscedasticity ] [ heuristic search ] [ hidden-parameter mdp ] [ hierarchical contrastive learning ] [ Hierarchical Imitation Learning ] [ Hierarchical Multi-Agent Learning ] [ Hierarchical Networks ] [ Hierarchical Reinforcement Learning ] [ Hierarchy-Aware Classification ] [ high-dimensional asymptotics ] [ high-dimensional statistic ] [ high-resolution video generation ] [ hindsight relabeling ] [ histogram binning ] [ historical color image classification ] [ HMC ] [ homomorphic encryption ] [ Homophily ] [ Hopfield layer ] [ Hopfield networks ] [ Hopfield Networks ] [ human-AI collaboration ] [ human 
cognition ] [ human-computer interaction ] [ human preferences ] [ human psychophysics ] [ humans in the loop ] [ hybrid systems ] [ Hyperbolic ] [ hyperbolic deep learning ] [ Hyperbolic Geometry ] [ hypercomplex representation learning ] [ hypergradients ] [ Hypernetworks ] [ hyperparameter ] [ Hyperparameter Optimization ] [ Hyper-Parameter Optimization ] [ HYPERPARAMETER OPTIMIZATION ] [ Image Classification ] [ image completion ] [ Image compression ] [ Image Editing ] [ Image Generation ] [ Image manipulation ] [ Image Modeling ] [ ImageNet ] [ image reconstruction ] [ Image segmentation ] [ Image Synthesis ] [ image-to-action learning ] [ Image-to-Image Translation ] [ image translation ] [ image warping ] [ imbalanced learning ] [ Imitation Learning ] [ Impartial Learning ] [ implicit bias ] [ Implicit Bias ] [ Implicit Deep Learning ] [ implicit differentiation ] [ implicit functions ] [ implicit neural representations ] [ Implicit Neural Representations ] [ Implicit Representation ] [ Importance Weighting ] [ impossibility ] [ incoherence ] [ Incompatible Environments ] [ Incremental Tree Transformations ] [ independent component analysis ] [ indirection ] [ Individual mediation effects ] [ Inductive Bias ] [ inductive biases ] [ inductive representation learning ] [ infinitely wide neural network ] [ Infinite-Width Limit ] [ infinite-width networks ] [ influence functions ] [ Influence Functions ] [ Information bottleneck ] [ Information Bottleneck ] [ Information Geometry ] [ information-theoretical probing ] [ Information theory ] [ Information Theory ] [ Initialization ] [ input-adaptive multi-exit neural networks ] [ input convex neural networks ] [ input-convex neural networks ] [ InstaHide ] [ Instance adaptation ] [ instance-based label noise ] [ Instance learning ] [ Instance-wise Learning ] [ Instrumental Variable Regression ] [ integral probability metric ] [ intention ] [ interaction networks ] [ Interactions ] [ interactive fiction ] [ 
Internet of Things ] [ Interpolation Peak ] [ Interpretability ] [ interpretable latent representation ] [ Interpretable Machine Learning ] [ interpretable policy learning ] [ in-the-wild data ] [ Intrinsically Motivated Reinforcement Learning ] [ Intrinsic Motivation ] [ intrinsic motivations ] [ Intrinsic Reward ] [ Invariance and Equivariance ] [ invariance penalty ] [ invariances ] [ Invariant and equivariant deep networks ] [ Invariant Representations ] [ invariant risk minimization ] [ Invariant subspaces ] [ inverse graphics ] [ Inverse reinforcement learning ] [ Inverse Reinforcement Learning ] [ Inverted Index ] [ irl ] [ IRM ] [ irregularly spaced time series ] [ irregular-observed data modelling ] [ isometric ] [ Isotropy ] [ iterated learning ] [ iterative training ] [ JEM ] [ Johnson-Lindenstrauss Transforms ] [ kernel ] [ Kernel Learning ] [ kernel method ] [ kernel-ridge regression ] [ kernels ] [ keypoint localization ] [ Knowledge distillation ] [ Knowledge Distillation ] [ Knowledge factorization ] [ Knowledge Graph Reasoning ] [ knowledge uncertainty ] [ Kullback-Leibler divergence ] [ Kurdyka-Łojasiewicz geometry ] [ label noise robustness ] [ Label Representation ] [ Label shift ] [ label smoothing ] [ Langevin dynamics ] [ Langevin sampling ] [ Language Grounding ] [ Language Model ] [ Language modeling ] [ Language Modeling ] [ Language Modelling ] [ Language Model Pre-training ] [ language processing ] [ language-specific modeling ] [ Laplace kernel ] [ Large-scale ] [ Large-scale Deep Learning ] [ large scale learning ] [ Large-scale Machine Learning ] [ large-scale pre-trained language models ] [ large-scale training ] [ large vocabularies ] [ Last-iterate Convergence ] [ Latency-aware Neural Architecture Search ] [ Latent Simplex ] [ latent space of GANs ] [ Latent Variable Models ] [ lattices ] [ Layer order ] [ layerwise sparsity ] [ learnable ] [ learned algorithms ] [ Learned compression ] [ learned ISTA ] [ Learning ] [ learning 
action representations ] [ learning-based ] [ learning dynamics ] [ Learning Dynamics ] [ Learning in Games ] [ learning mechanisms ] [ Learning physical laws ] [ Learning Theory ] [ Learning to Hash ] [ learning to optimize ] [ Learning to Optimize ] [ learning to rank ] [ Learning to Rank ] [ learning to teach ] [ learning with noisy labels ] [ Learning with noisy labels ] [ library ] [ lifelong ] [ Lifelong learning ] [ Lifelong Learning ] [ lifted inference ] [ likelihood-based models ] [ likelihood-free inference ] [ limitations ] [ limited data ] [ linear bandits ] [ Linear Convergence ] [ linear estimator ] [ Linear Regression ] [ linear terms ] [ linformer ] [ Lipschitz constants ] [ Lipschitz constrained networks ] [ Local Explanations ] [ locality sensitive hashing ] [ Locally supervised training ] [ local Rademacher complexity ] [ log-concavity ] [ Logic ] [ Logic Rules ] [ logsignature ] [ Long-Tailed Recognition ] [ long-tail learning ] [ Long-term dependencies ] [ long-term prediction ] [ long-term stability ] [ loss correction ] [ Loss function search ] [ Loss Function Search ] [ lossless source compression ] [ Lottery Ticket ] [ Lottery Ticket Hypothesis ] [ lottery tickets ] [ low-dimensional structure ] [ lower bound ] [ lower bounds ] [ Low-latency ASR ] [ low precision training ] [ low rank ] [ low-rank approximation ] [ low-rank tensors ] [ L-smoothness ] [ LSTM ] [ Lyapunov Chaos ] [ Machine learning ] [ Machine Learning ] [ machine learning for code ] [ Machine Learning for Robotics ] [ Machine Learning (ML) for Programming Languages (PL)/Software Engineering (SE) ] [ machine learning systems ] [ Machine translation ] [ Machine Translation ] [ magnitude-based pruning ] [ Manifold clustering ] [ Manifolds ] [ Many-task ] [ mapping ] [ Markov chain Monte Carlo ] [ Markov Chain Monte Carlo ] [ Markov jump process ] [ Masked Reconstruction ] [ mathematical reasoning ] [ Matrix and Tensor Factorization ] [ matrix completion ] [ matrix 
decomposition ] [ Matrix Factorization ] [ max-margin ] [ MCMC ] [ MCMC sampling ] [ mean estimation ] [ mean-field dynamics ] [ mean separation ] [ Mechanism Design ] [ medical time series ] [ mel-filterbanks ] [ memorization ] [ Memorization ] [ Memory ] [ memory efficient ] [ memory efficient training ] [ Memory Mapping ] [ memory optimized training ] [ Memory-saving ] [ mesh ] [ Message Passing ] [ Message Passing GNNs ] [ meta-gradients ] [ Meta-learning ] [ Meta Learning ] [ Meta-Learning ] [ Metric Surrogate ] [ minimax optimal rate ] [ Minimax Optimization ] [ minimax risk ] [ Minmax ] [ min-max optimization ] [ mirror-prox ] [ Missing Data Inference ] [ Missing value imputation ] [ Missing Values ] [ missing data ] [ mixed precision ] [ Mixed Precision ] [ Mixed-precision quantization ] [ mixture density nets ] [ mixture of experts ] [ mixup ] [ Mixup ] [ MixUp ] [ MLaaS ] [ MoCo ] [ Model Attribution ] [ model-based control ] [ model-based learning ] [ Model-based Reinforcement Learning ] [ Model-Based Reinforcement Learning ] [ model-based RL ] [ Model-based RL ] [ Model Biases ] [ Model compression ] [ model extraction ] [ model fairness ] [ Model Inversion ] [ model order reduction ] [ model ownership ] [ model predictive control ] [ model-predictive control ] [ Model Predictive Control ] [ Model privacy ] [ Models for code ] [ models of learning and generalization ] [ Model stealing ] [ Modern Hopfield Network ] [ modern Hopfield networks ] [ modified equation analysis ] [ modular architectures ] [ Modular network ] [ modular networks ] [ modular neural networks ] [ modular representations ] [ modulated convolution ] [ Molecular conformation generation ] [ molecular design ] [ Molecular Dynamics ] [ molecular graph generation ] [ Molecular Representation ] [ Molecule Design ] [ Momentum ] [ momentum methods ] [ momentum optimizer ] [ monotonicity ] [ Monte Carlo ] [ Monte-Carlo tree search ] [ Monte Carlo Tree Search ] [ morphology ] [ Morse theory ] 
[ mpc ] [ Multi-agent ] [ Multi-agent games ] [ Multiagent Learning ] [ multi-agent platform ] [ Multi-Agent Policy Gradients ] [ Multi-agent reinforcement learning ] [ Multi-agent Reinforcement Learning ] [ Multi-Agent Reinforcement Learning ] [ Multi-Agent Transfer Learning ] [ multiclass classification ] [ multi-dimensional discrete action spaces ] [ Multi-domain ] [ multi-domain disentanglement ] [ multi-head attention ] [ Multi-Hop ] [ multi-hop question answering ] [ Multi-hop Reasoning ] [ Multilingual Modeling ] [ multilingual representations ] [ multilingual transformer ] [ multilingual translation ] [ Multimodal ] [ Multi-Modal ] [ Multimodal Attention ] [ multi-modal learning ] [ Multimodal Learning ] [ Multi-Modal Learning ] [ Multimodal Spaces ] [ Multi-objective optimization ] [ multi-player ] [ Multiplicative Weights Update ] [ Multi-scale Representation ] [ multitask ] [ Multi-task ] [ Multi-task Learning ] [ Multi Task Learning ] [ Multi-Task Learning ] [ multi-task learning theory ] [ Multitask Reinforcement Learning ] [ Multi-view Learning ] [ Multi-View Learning ] [ Multi-view Representation Learning ] [ Mutual Information ] [ MuZero ] [ Named Entity Recognition ] [ NAS ] [ nash ] [ natural gradient descent ] [ Natural Language Processing ] [ natural scene statistics ] [ natural sparsity ] [ Negative Sampling ] [ negotiation ] [ nested optimization ] [ network architecture ] [ Network Architecture ] [ Network Inductive Bias ] [ network motif ] [ Network pruning ] [ Network Pruning ] [ networks ] [ network trainability ] [ network width ] [ Neural Architecture Search ] [ Neural Attention Distillation ] [ neural collapse ] [ Neural data compression ] [ Neural IR ] [ neural kernels ] [ neural link prediction ] [ Neural Model Explanation ] [ neural module network ] [ Neural Network ] [ Neural Network Bounding ] [ neural network calibration ] [ Neural Network Gaussian Process ] [ neural network robustness ] [ Neural networks ] [ Neural Networks ] [ 
neural network training ] [ Neural Network Verification ] [ neural ode ] [ Neural ODE ] [ Neural ODEs ] [ Neural operators ] [ Neural Physics Engines ] [ Neural Processes ] [ neural reconstruction ] [ neural sound synthesis ] [ neural spike train ] [ neural symbolic reasoning ] [ neural tangent kernel ] [ Neural tangent kernel ] [ Neural Tangent Kernel ] [ neural tangent kernels ] [ Neural text decoding ] [ neurobiology ] [ Neuroevolution ] [ Neuro symbolic ] [ Neuro-Symbolic Learning ] [ neuro-symbolic models ] [ NLI ] [ NLP ] [ Node Embeddings ] [ noise contrastive estimation ] [ Noise-contrastive learning ] [ Noise model ] [ noise robust learning ] [ Noisy Demonstrations ] [ noisy label ] [ Noisy Label ] [ Noisy Labels ] [ Non-asymptotic Confidence Intervals ] [ non-autoregressive generation ] [ nonconvex ] [ non-convex learning ] [ Non-Convex Optimization ] [ Non-IID ] [ nonlinear control theory ] [ nonlinear dynamical systems ] [ nonlinear Hawkes process ] [ nonlinear walk ] [ Non-Local Modules ] [ non-minimax optimization ] [ nonnegative PCA ] [ nonseparable Hamiltonian system ] [ non-smooth models ] [ non-stationary stochastic processes ] [ no-regret learning ] [ normalized maximum likelihood ] [ normalize layer ] [ normalizers ] [ Normalizing Flow ] [ normalizing flows ] [ Normalizing flows ] [ Normalizing Flows ] [ normative models ] [ novelty-detection ] [ ntk ] [ number of linear regions ] [ numerical errors ] [ numerical linear algebra ] [ object-centric representations ] [ Object detection ] [ Object Detection ] [ object-keypoint representations ] [ ObjectNet ] [ Object Permanence ] [ Observational Imitation ] [ ODE ] [ offline ] [ offline/batch reinforcement learning ] [ off-line reinforcement learning ] [ offline reinforcement learning ] [ Offline Reinforcement Learning ] [ offline RL ] [ off-policy evaluation ] [ Off Policy Evaluation ] [ Off-policy policy evaluation ] [ Off-Policy Reinforcement Learning ] [ off-policy RL ] [ one-class-classification 
] [ one-to-many mapping ] [ Open-domain ] [ open domain complex question answering ] [ open source ] [ Optimal Control Theory ] [ optimal convergence ] [ optimal power flow ] [ Optimal Transport ] [ optimal transport maps ] [ Optimisation for Deep Learning ] [ optimism ] [ Optimistic Gradient Descent Ascent ] [ Optimistic Mirror Descent ] [ Optimistic Multiplicative Weights Update ] [ Optimization ] [ order learning ] [ ordinary differential equation ] [ orthogonal ] [ orthogonal layers ] [ orthogonal machine learning ] [ Orthogonal Polynomials ] [ Oscillators ] [ outlier detection ] [ outlier-detection ] [ Outlier detection ] [ out-of-distribution ] [ Out-of-distribution detection in deep learning ] [ out-of-distribution generalization ] [ Out-of-domain ] [ over-fitting ] [ Overfitting ] [ overparameterisation ] [ over-parameterization ] [ Over-parameterization ] [ Overparameterization ] [ overparameterized neural networks ] [ Over-smoothing ] [ Oversmoothing ] [ over-squashing ] [ PAC Bayes ] [ padding ] [ parallel Monte Carlo Tree Search (MCTS) ] [ parallel tempering ] [ Parameter-Reduced MLR ] [ part-based ] [ Partial Amortization ] [ Partial differential equation ] [ partial differential equations ] [ partially observed environments ] [ particle inference ] [ pca ] [ pde ] [ pdes ] [ PDEs ] [ performer ] [ persistence diagrams ] [ personalized learning ] [ perturbation sets ] [ Peter-Weyl Theorem ] [ phase retrieval ] [ Physical parameter estimation ] [ physical reasoning ] [ physical scene understanding ] [ Physical Simulation ] [ physical symbol grounding ] [ physics ] [ physics-guided deep learning ] [ piecewise linear function ] [ pipeline toolkit ] [ plan-based reward shaping ] [ Planning ] [ Poincaré Ball Model ] [ Point cloud ] [ Point clouds ] [ point processes ] [ pointwise mutual information ] [ poisoning ] [ poisoning attack ] [ poisson matrix factorization ] [ policy learning ] [ Policy Optimization ] [ polynomial time ] [ Pose Estimation ] [ 
Position Embedding ] [ Position Encoding ] [ post-hoc calibration ] [ Post-Hoc Correction ] [ Post Training Quantization ] [ power grid management ] [ Predictive Modeling ] [ predictive uncertainty ] [ Predictive Uncertainty Estimation ] [ pretrained language model ] [ pretrained language model. ] [ pre-trained language model fine-tuning ] [ Pretrained Language Models ] [ Pretrained Text Encoders ] [ pre-training ] [ Pre-training ] [ Primitive Discovery ] [ principal components analysis ] [ Privacy ] [ privacy leakage from gradients ] [ privacy preserving machine learning ] [ Privacy-utility tradeoff ] [ probabilistic models ] [ probabilistic generative models ] [ probabilistic inference ] [ probabilistic matrix factorization ] [ Probabilistic Methods ] [ probabilistic multivariate forecasting ] [ probabilistic numerics ] [ probabilistic programs ] [ probably approximately correct guarantee ] [ Probe ] [ probing ] [ procedural generation ] [ procedural knowledge ] [ product of experts ] [ Product Quantization ] [ Program obfuscation ] [ Program Synthesis ] [ Proper Scoring Rules ] [ protein ] [ prototype propagation ] [ Provable Robustness ] [ provable sample efficiency ] [ proximal gradient descent-ascent ] [ proxy ] [ Pruning ] [ Pruning at initialization ] [ pseudo-labeling ] [ Pseudo-Labeling ] [ QA ] [ Q-learning ] [ Quantization ] [ quantum machine learning ] [ quantum mechanics ] [ Quantum Mechanics ] [ Question Answering ] [ random ] [ Random Feature ] [ Random Features ] [ Randomized Algorithms ] [ Random Matrix Theory ] [ Random Weights Neural Networks ] [ rank-collapse ] [ rank-constrained convex optimization ] [ rao ] [ rao-blackwell ] [ Rate-distortion optimization ] [ raven's progressive matrices ] [ real time recurrent learning ] [ real-world ] [ Real-world image denoising ] [ reasoning paths ] [ recommendation systems ] [ recommender system ] [ Recommender Systems ] [ recovery likelihood ] [ rectified linear unit ] [ Recurrent Generative Model ] [ 
Recurrent Neural Network ] [ Recurrent neural networks ] [ Recurrent Neural Networks ] [ recursive dense retrieval ] [ reformer ] [ regime agnostic methods ] [ Regression ] [ Regression without correspondence ] [ regret analysis ] [ regret minimization ] [ Regularization ] [ Regularization by denoising ] [ regularized markov decision processes ] [ Reinforcement ] [ Reinforcement learning ] [ Reinforcement Learning ] [ Reinforcement Learnings ] [ Reinforcement learning theory ] [ relabelling ] [ Relational regularized autoencoder ] [ Relation Extraction ] [ relaxed regularization ] [ relu network ] [ ReLU networks ] [ Rematerialization ] [ Render-and-Compare ] [ Reparameterization ] [ repetitions ] [ replica exchange ] [ representational learning ] [ representation analysis ] [ Representation learning ] [ Representation Learning ] [ representation learning for computer vision ] [ representation learning for robotics ] [ representation of dynamical systems ] [ Representation Theory ] [ reproducibility ] [ reproducible research ] [ Reproducing kernel Hilbert space ] [ resampling ] [ reset-free ] [ residual ] [ ResNets ] [ resource constrained ] [ Restricted Boltzmann Machines ] [ retraining ] [ Retrieval ] [ reverse accuracy ] [ reverse engineering ] [ reward learning ] [ reward randomization ] [ reward shaping ] [ reweighting ] [ Rich observation ] [ rich observations ] [ risk-averse ] [ Risk bound ] [ Risk Estimation ] [ risk sensitive ] [ rl ] [ RMSprop ] [ RNA-protein interaction prediction ] [ RNA structure ] [ RNA structure embedding ] [ RNN ] [ RNNs ] [ robotic manipulation ] [ robust ] [ robust control ] [ robust deep learning ] [ Robust Deep Learning ] [ robust learning ] [ Robust Learning ] [ Robust Machine Learning ] [ Robustness ] [ Robustness certificates ] [ Robust Overfitting ] [ ROC ] [ Role-Based Learning ] [ rooted graphs ] [ Rotation invariance ] [ rtrl ] [ Runtime Systems ] [ Saddle-point Optimization ] [ safe ] [ Safe exploration ] [ safe planning 
] [ Saliency ] [ Saliency Guided Data Augmentation ] [ saliency maps ] [ SaliencyMix ] [ sample complexity separation ] [ Sample Efficiency ] [ sample information ] [ sample reweighting ] [ Sampling ] [ sampling algorithms ] [ Scalability ] [ Scale ] [ scale-invariant weights ] [ Scale of initialization ] [ scene decomposition ] [ scene generation ] [ Scene Understanding ] [ Science ] [ science of deep learning ] [ score-based generative models ] [ score matching ] [ score-matching ] [ SDE ] [ Second-order analysis ] [ second-order approximation ] [ second-order optimization ] [ Security ] [ segmented models ] [ selective classification ] [ Self-Imitation ] [ self supervised learning ] [ Self-supervised learning ] [ Self-supervised Learning ] [ Self Supervised Learning ] [ Self-Supervised Learning ] [ self-supervision ] [ self-training ] [ self-training theory ] [ semantic anomaly detection ] [ semantic directions in latent space ] [ semantic graphs ] [ Semantic Image Synthesis ] [ semantic parsing ] [ semantic role labeling ] [ semantic-segmentation ] [ Semantic Segmentation ] [ Semantic Textual Similarity ] [ semi-infinite duality ] [ semi-nonnegative matrix factorization ] [ semiparametric inference ] [ semi-supervised ] [ Semi-supervised Learning ] [ Semi-Supervised Learning ] [ semi-supervised learning theory ] [ Sentence Embeddings ] [ Sentence Representations ] [ Sentiment ] [ separation of variables ] [ Sequence Data ] [ Sequence Modeling ] [ sequence models ] [ Sequence-to-sequence learning ] [ sequence-to-sequence models ] [ sequential data ] [ Sequential probability ratio test ] [ Sequential Representation Learning ] [ set prediction ] [ set transformer ] [ SGD ] [ SGD noise ] [ sgld ] [ Shape ] [ shape bias ] [ Shape Bias ] [ Shape Encoding ] [ shapes ] [ Shapley values ] [ Sharpness Minimization ] [ side channel analysis ] [ Sigma Delta Quantization ] [ sign agnostic learning ] [ signal propagation ] [ signature ] [ sim2real ] [ sim2real transfer ] [ 
simple ] [ Singularity analysis ] [ singular value decomposition ] [ Sinkhorn algorithm ] [ skeleton-based action recognition ] [ sketch-based modeling ] [ sketches ] [ Skill Discovery ] [ SLAM ] [ sliced fused Gromov Wasserstein ] [ Sliced Wasserstein ] [ Slowdown attacks ] [ slowness ] [ Smooth games ] [ smoothing ] [ SMT Solvers ] [ social perception ] [ Soft Body ] [ soft labels ] [ software ] [ sound classification ] [ sound spatialization ] [ Source Code ] [ sparse Bayesian learning ] [ Sparse Embedding ] [ sparse embeddings ] [ sparse reconstruction ] [ sparse representation ] [ sparse representations ] [ sparse stochastic gates ] [ Sparsity ] [ Sparsity Learning ] [ spatial awareness ] [ spatial bias ] [ spatial uncertainty ] [ spatio-temporal forecasting ] [ spatio-temporal graph ] [ spatio-temporal modeling ] [ spatio-temporal modelling ] [ spatiotemporal prediction ] [ Spatiotemporal Understanding ] [ Spectral Analysis ] [ Spectral Distribution ] [ Spectral Graph Filter ] [ spectral regularization ] [ speech generation ] [ speech-impaired ] [ speech processing ] [ speech recognition. 
] [ Speech Recognition ] [ spherical distributions ] [ spiking neural network ] [ spurious correlations ] [ square loss vs cross-entropy ] [ stability theory ] [ State abstraction ] [ state abstractions ] [ state-space models ] [ statistical learning theory ] [ Statistical Learning Theory ] [ statistical physics ] [ Statistical Physics ] [ statistical physics methods ] [ Steerable Kernel ] [ Stepsize optimization ] [ stochastic asymptotics ] [ stochastic control ] [ (stochastic) gradient descent ] [ Stochastic Gradient Descent ] [ stochastic gradient Langevin dynamics ] [ stochastic process ] [ Stochastic Processes ] [ stochastic subgradient method ] [ Storage Capacity ] [ straight-through ] [ straightthrough ] [ strategic behavior ] [ Streaming ASR ] [ structural biology ] [ structural credit assignment ] [ structural inductive bias ] [ Structured Pruning ] [ Structure learning ] [ structure prediction ] [ structures prediction ] [ Style Mixing ] [ Style Transfer ] [ subgraph reasoning. 
] [ sublinear ] [ submodular optimization ] [ Subspace clustering ] [ Summarization ] [ summary statistics ] [ superpixel ] [ supervised contrastive learning ] [ Supervised Deep Networks ] [ Supervised Learning ] [ support estimation ] [ surprisal ] [ surrogate models ] [ svd ] [ SVD ] [ Symbolic Methods ] [ symbolic regression ] [ symbolic representations ] [ Symmetry ] [ symplectic networks ] [ Syntax ] [ Synthetic benchmark dataset ] [ synthetic-to-real generalization ] [ Systematic generalisation ] [ Systematicity ] [ System identification ] [ Tabular ] [ tabular data ] [ Tabular Data ] [ targeted attack ] [ Task Embeddings ] [ task generation ] [ task-oriented dialogue ] [ Task-oriented Dialogue System ] [ task reduction ] [ Task Segmentation ] [ Teacher-Student Learning ] [ teacher-student model ] [ temporal context ] [ Temporal knowledge graph ] [ temporal networks ] [ tensor product ] [ Text-based Games ] [ Text Representation ] [ Text Retrieval ] [ Text to speech ] [ Text to speech synthesis ] [ text-to-sql ] [ Texture ] [ Texture Bias ] [ Textworld ] [ Theorem proving ] [ theoretical issues in deep learning ] [ theoretical limits ] [ theoretical study ] [ Theory ] [ Theory of deep learning ] [ theory of mind ] [ Third-Person Imitation ] [ Thompson sampling ] [ time-frequency representations ] [ timescale ] [ timescales ] [ Time Series ] [ Time series forecasting ] [ time series prediction ] [ topic modelling ] [ Topology ] [ training dynamics ] [ Training Method ] [ trajectory ] [ trajectory optimization ] [ trajectory prediction ] [ Transferability ] [ Transfer learning ] [ Transfer Learning ] [ transformation invariance ] [ Transformer ] [ Transformers ] [ traveling salesperson problem ] [ Tree-structured Data ] [ trembl ] [ tropical function ] [ trust region ] [ two-layer neural network ] [ Uncertainty ] [ uncertainty calibration ] [ Uncertainty estimates ] [ Uncertainty estimation ] [ Uncertainty Machine Learning ] [ understanding ] [ understanding 
CNNs ] [ Understanding Data Augmentation ] [ understanding decision-making ] [ understanding deep learning ] [ Understanding Deep Learning ] [ understanding neural networks ] [ U-Net ] [ unidirectional ] [ uniprot ] [ universal approximation ] [ Universal approximation ] [ Universality ] [ universal representation learning ] [ universal sound separation ] [ unlabeled data ] [ Unlabeled Entity Problem ] [ Unlearnable Examples ] [ unrolled algorithms ] [ Unsupervised denoising ] [ Unsupervised Domain Translation ] [ unsupervised image denoising ] [ Unsupervised learning ] [ Unsupervised Learning ] [ unsupervised learning theory ] [ unsupervised loss ] [ Unsupervised Meta-learning ] [ unsupervised object discovery ] [ Unsupervised reinforcement learning ] [ unsupervised skill discovery ] [ unsupervised stabilization ] [ Upper Confidence bound applied to Trees (UCT) ] [ Usable Information ] [ VAE ] [ Value factorization ] [ value learning ] [ vanishing gradient problem ] [ variable binding ] [ variable convergence ] [ Variable Embeddings ] [ Variance Networks ] [ Variational Auto-encoder ] [ Variational autoencoders ] [ Variational Autoencoders ] [ Variational inference ] [ variational information bottleneck ] [ Verification ] [ video analysis ] [ Video Classification ] [ Video Compression ] [ video generation ] [ video-grounded dialogues ] [ Video prediction ] [ Video Reasoning ] [ video recognition ] [ Video Recognition ] [ video representation learning ] [ video synthesis ] [ video-text learning ] [ views ] [ virtual environment ] [ vision-and-language-navigation ] [ visual counting ] [ visualization ] [ visual perception ] [ Visual Reasoning ] [ visual reinforcement learning ] [ visual representation learning ] [ visual saliency ] [ vocoder ] [ voice conversion ] [ Volume Analysis ] [ VQA ] [ vulnerability of RL ] [ wanet ] [ warping functions ] [ Wasserstein ] [ wasserstein-2 barycenters ] [ wasserstein-2 distance ] [ Wasserstein distance ] [ waveform generation ] 
[ weakly-supervised learning ] [ weakly supervised representation learning ] [ Weak supervision ] [ Weak-supervision ] [ webly-supervised learning ] [ weight attack ] [ weight balance ] [ Weight quantization ] [ weight-sharing ] [ wide local minima ] [ Wigner-Eckart Theorem ] [ winning tickets ] [ wireframe model ] [ word-learning ] [ world models ] [ World Models ] [ worst-case generalisation ] [ xai ] [ XAI ] [ zero-order optimization ] [ zero-shot learning ] [ Zero-shot learning ] [ Zero-shot Learning ] [ Zero-shot synthesis ]

216 Results

Poster
Mon 1:00 Temporally-Extended ε-Greedy Exploration
Will Dabney, Georg Ostrovski, Andre Barreto
Poster
Mon 1:00 Does enhanced shape bias improve neural network robustness to common corruptions?
Chaithanya Kumar Mummadi, Ranjitha Subramaniam, Robin Hutmacher, Julien Vitay, Volker Fischer, Jan Hendrik Metzen
Poster
Mon 1:00 Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability
Suraj Srinivas, François Fleuret
Poster
Mon 1:00 Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach
Malik Tiomoko, Hafiz Tiomoko Ali, Romain Couillet
Poster
Mon 1:00 A Unified Approach to Interpreting and Boosting Adversarial Transferability
Xin Wang, Jie Ren, Shuyun Lin, Xiangming Zhu, Yisen Wang, Quanshi Zhang
Poster
Mon 1:00 Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs
Pim De Haan, Maurice Weiler, Taco Cohen, Max Welling
Poster
Mon 1:00 Interpreting and Boosting Dropout from a Game-Theoretic View
Hao Zhang, Sen Li, YinChao Ma, Mingjie Li, Yichen Xie, Quanshi Zhang
Poster
Mon 1:00 On the Universality of the Double Descent Peak in Ridgeless Regression
David Holzmüller
Poster
Mon 1:00 What Makes Instance Discrimination Good for Transfer Learning?
Nanxuan Zhao, Zhirong Wu, Rynson W Lau, Stephen Lin
Poster
Mon 1:00 Neural Jump Ordinary Differential Equations: Consistent Continuous-Time Prediction and Filtering
Calypso Herrera, Florian Krach, Josef Teichmann
Poster
Mon 1:00 ResNet After All: Neural ODEs and Their Numerical Solution
Katharina Ott, Prateek Katiyar, Philipp Hennig, Michael Tiemann
Spotlight
Mon 3:30 Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach
Malik Tiomoko, Hafiz Tiomoko Ali, Romain Couillet
Oral
Mon 5:30 Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability
Suraj Srinivas, François Fleuret
Poster
Mon 9:00 WrapNet: Neural Net Inference with Ultra-Low-Precision Arithmetic
Renkun Ni, Hong-Min Chu, Oscar Castaneda, Ping-yeh Chiang, Christoph Studer, Tom Goldstein
Poster
Mon 9:00 Selectivity considered harmful: evaluating the causal impact of class selectivity in DNNs
Matthew Leavitt, Ari Morcos
Poster
Mon 9:00 Learning "What-if" Explanations for Sequential Decision-Making
Ioana Bica, Dan Jarrett, Alihan Hüyük, Mihaela van der Schaar
Poster
Mon 9:00 Representation learning for improved interpretability and classification accuracy of clinical factors from EEG
Garrett Honke, Irina Higgins, Nina Thigpen, Vladimir Miskovic, Katie Link, Sunny Duan, Pramod Gupta, Julia Klawohn, Greg Hajcak
Poster
Mon 9:00 Fast convergence of stochastic subgradient method under interpolation
Huang Fang, Zhenan Fan, Michael Friedlander
Poster
Mon 9:00 Unsupervised Meta-Learning through Latent-Space Interpolation in Generative Models
Siavash Khodadadeh, Sharare Zehtabian, Saeed Vahidian, Weijia Wang, Bill Lin, Ladislau Boloni
Poster
Mon 9:00 The role of Disentanglement in Generalisation
Milton Montero, Casimir JH Ludwig, Rui Ponte Costa, Gaurav Malhotra, Jeffrey Bowers
Poster
Mon 9:00 Using latent space regression to analyze and leverage compositionality in GANs
Lucy Chai, Jonas Wulff, Phillip Isola
Poster
Mon 9:00 On Statistical Bias In Active Learning: How and When to Fix It
Sebastian Farquhar, Yarin Gal, Tom Rainforth
Poster
Mon 9:00 Disentangling 3D Prototypical Networks for Few-Shot Concept Learning
Mihir Prabhudesai, Shamit Lal, Darshan Patil, Hsiao-Yu Tung, Adam Harley, Katerina Fragkiadaki
Poster
Mon 9:00 Shapley Explanation Networks
Rui Wang, Xiaoqian Wang, David Inouye
Poster
Mon 9:00 LambdaNetworks: Modeling long-range Interactions without Attention
Irwan Bello
Poster
Mon 9:00 Sparse encoding for more-interpretable feature-selecting representations in probabilistic matrix factorization
Joshua Chang, Patrick A Fletcher, Jungmin Han, Ted Chang, Shashaank Vattikuti, Bart Desmet, Ayah Zirikly, Carson Chow
Poster
Mon 9:00 Structured Prediction as Translation between Augmented Natural Languages
Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto
Poster
Mon 9:00 Overparameterisation and worst-case generalisation: friend or foe?
Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar
Poster
Mon 9:00 Adaptive Federated Optimization
Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, Brendan McMahan
Poster
Mon 9:00 Multi-Time Attention Networks for Irregularly Sampled Time Series
Satya Narayan Shukla, Benjamin M Marlin
Spotlight
Mon 12:25 Sharpness-aware Minimization for Efficiently Improving Generalization
Pierre Foret, Ariel Kleiner, Hossein Mobahi, Behnam Neyshabur
Spotlight
Mon 12:45 On Statistical Bias In Active Learning: How and When to Fix It
Sebastian Farquhar, Yarin Gal, Tom Rainforth
Invited Talk
Mon 16:00 Commonsense AI: Myth and Truth
Yejin Choi
Poster
Mon 17:00 Layer-adaptive Sparsity for the Magnitude-based Pruning
Jaeho Lee, Sejun Park, Sangwoo Mo, Sungsoo Ahn, Jinwoo Shin
Poster
Mon 17:00 Robust Curriculum Learning: from clean label detection to noisy label self-correction
Tianyi Zhou, Shengjie Wang, Jeff Bilmes
Poster
Mon 17:00 Spatio-Temporal Graph Scattering Transform
Chao Pan, Siheng Chen, Antonio Ortega
Poster
Mon 17:00 Regularization Matters in Policy Optimization - An Empirical Study on Continuous Control
Zhuang Liu, Xuanlin Li, Bingyi Kang, Trevor Darrell
Poster
Mon 17:00 Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods
Taiji Suzuki, Akiyama Shunta
Poster
Mon 17:00 MixKD: Towards Efficient Distillation of Large-scale Language Models
Kevin Liang, Weituo Hao, Dinghan Shen, Yufan Zhou, Weizhu Chen, Changyou Chen, Lawrence Carin
Poster
Mon 17:00 On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning
Ren Wang, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Lily Weng, Chuang Gan, Meng Wang
Poster
Mon 17:00 Tilted Empirical Risk Minimization
Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith
Poster
Mon 17:00 When does preconditioning help or hurt generalization?
Shun-ichi Amari, Jimmy Ba, Roger Grosse, Chen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu
Spotlight
Mon 19:45 Structured Prediction as Translation between Augmented Natural Languages
Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto
Spotlight
Mon 20:58 HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark
Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, Yingyan Lin
Poster
Tue 1:00 Intraclass clustering: an implicit learning ability that regularizes DNNs
Simon Carbonnelle, Christophe De Vleeschouwer
Poster
Tue 1:00 Learning Accurate Entropy Model with Global Reference for Image Compression
Yichen Qian, Zhiyu Tan, Xiuyu Sun, Ming Lin, Dongyang Li, Zhenhong Sun, Li Hao, Rong Jin
Poster
Tue 1:00 Exemplary Natural Images Explain CNN Activations Better than State-of-the-Art Feature Visualization
Judy Borowski, Roland Zimmermann, Judith Schepers, Robert Geirhos, Thomas S Wallis, Matthias Bethge, Wieland Brendel
Poster
Tue 1:00 A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention
Grégoire Mialon, Dexiong Chen, Alexandre d'Aspremont, Julien Mairal
Poster
Tue 1:00 Model-based micro-data reinforcement learning: what are the crucial model properties and which model to choose?
Balázs Kégl, Gabriel Hurtado, Albert Thomas
Poster
Tue 1:00 Hyperbolic Neural Networks++
Ryohei Shimizu, Yusuke Mukuta, Tatsuya Harada
Poster
Tue 1:00 Large-width functional asymptotics for deep Gaussian neural networks
Daniele Bracale, Stefano Favaro, Sandra Fortini, Stefano Peluchetti
Poster
Tue 1:00 PDE-Driven Spatiotemporal Disentanglement
Jérémie Dona, Jean-Yves Franceschi, Sylvain Lamprier, Patrick Gallinari
Poster
Tue 1:00 Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering
Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler
Poster
Tue 1:00 Calibration tests beyond classification
David Widmann, Fredrik Lindsten, Dave Zachariah
Poster
Tue 1:00 Conformation-Guided Molecular Representation with Hamiltonian Neural Networks
Ziyao Li, Shuwen Yang, Guojie Song, Lingsheng Cai
Poster
Tue 1:00 Sample-Efficient Automated Deep Reinforcement Learning
Jörg Franke, Gregor Koehler, André Biedenkapp, Frank Hutter
Poster
Tue 1:00 IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression
Rianne van den Berg, Alexey Gritsenko, Mostafa Dehghani, Casper Sønderby, Tim Salimans
Spotlight
Tue 3:25 Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking
Michael Schlichtkrull, Nicola De Cao, Ivan Titov
Spotlight
Tue 3:45 Gauge Equivariant Mesh CNNs: Anisotropic convolutions on geometric graphs
Pim De Haan, Maurice Weiler, Taco Cohen, Max Welling
Spotlight
Tue 5:38 Fidelity-based Deep Adiabatic Scheduling
Eli Ovits, Lior Wolf
Poster
Tue 9:00 DialoGraph: Incorporating Interpretable Strategy-Graph Networks into Negotiation Dialogues
Rishabh Joshi, Vidhisha Balachandran, Shikhar Vashishth, Alan Black, Yulia Tsvetkov
Poster
Tue 9:00 Statistical inference for individual fairness
Subha Maity, Songkai Xue, Mikhail Yurochkin, Yuekai Sun
Poster
Tue 9:00 Nearest Neighbor Machine Translation
Urvashi Khandelwal, Angela Fan, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis
Poster
Tue 9:00 The geometry of integration in text classification RNNs
Kyle Aitken, Vinay Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan
Poster
Tue 9:00 Fair Mixup: Fairness via Interpolation
Ching-Yao Chuang, Youssef Mroueh
Poster
Tue 9:00 Interpreting Knowledge Graph Relation Representation from Word Embeddings
Carl Allen, Ivana Balazevic, Timothy Hospedales
Poster
Tue 9:00 Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N Bennett, Junaid Ahmed, Arnold Overwijk
Poster
Tue 9:00 Understanding Over-parameterization in Generative Adversarial Networks
Yogesh Balaji, Mohammadmahdi Sajedi, Neha Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi
Poster
Tue 9:00 Uncertainty Estimation in Autoregressive Structured Prediction
Andrey Malinin, Mark Gales
Poster
Tue 9:00 Tradeoffs in Data Augmentation: An Empirical Study
Rapha Gontijo Lopes, Sylvia Smullin, Ekin Cubuk, Ethan Dyer
Poster
Tue 9:00 Physics-aware, probabilistic model order reduction with guaranteed stability
Sebastian Kaltenbach, PS Koutsourelakis
Poster
Tue 9:00 Getting a CLUE: A Method for Explaining Uncertainty Estimates
Javier Antorán, Umang Bhatt, Tameem Adel, Adrian Weller, José Miguel Hernández Lobato
Poster
Tue 9:00 Shape or Texture: Understanding Discriminative Features in CNNs
Md Amirul Islam, Matthew Kowal, Patrick Esser, Sen Jia, Björn Ommer, Kosta Derpanis, Neil Bruce
Poster
Tue 9:00 Clairvoyance: A Pipeline Toolkit for Medical Time Series
Dan Jarrett, Jinsung Yoon, Ioana Bica, Zhaozhi Qian, Ari Ercole, Mihaela van der Schaar
Poster
Tue 9:00 Direction Matters: On the Implicit Bias of Stochastic Gradient Descent with Moderate Learning Rate
Jingfeng Wu, Difan Zou, Vladimir Braverman, Quanquan Gu
Poster
Tue 9:00 On the Origin of Implicit Regularization in Stochastic Gradient Descent
Samuel Smith, Benoit Dherin, David Barrett, Soham De
Poster
Tue 9:00 Sharper Generalization Bounds for Learning with Gradient-dominated Objective Functions
Yunwen Lei, Yiming Ying
Poster
Tue 9:00 Robust Pruning at Initialization
Soufiane Hayou, Jean-Francois Ton, Arnaud Doucet, Yee Whye Teh
Oral
Tue 12:15 Image GANs meet Differentiable Rendering for Inverse Graphics and Interpretable 3D Neural Rendering
Yuxuan Zhang, Wenzheng Chen, Huan Ling, Jun Gao, Yinan Zhang, Antonio Torralba, Sanja Fidler
Expo Talk Panel
Tue 14:00 Interpretability with skeptical and user-centric mind
Been Kim
Poster
Tue 17:00 Deep Equals Shallow for ReLU Networks in Kernel Regimes
Alberto Bietti, Francis Bach
Poster
Tue 17:00 Monotonic Kronecker-Factored Lattice
William Bakst, Nobuyuki Morioka, Erez Louidor
Poster
Tue 17:00 Mirostat: A Neural Text Decoding Algorithm That Directly Controls Perplexity
Sourya Basu, Govardana Sachithanandam Ramachandran, Nitish Shirish Keskar, Lav R Varshney
Poster
Tue 17:00 DDPNOpt: Differential Dynamic Programming Neural Optimizer
Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou
Poster
Tue 17:00 Debiasing Concept-based Explanations with Causal Analysis
Taha Bahadori, David Heckerman
Poster
Tue 17:00 Concept Learners for Few-Shot Learning
Kaidi Cao, Maria Brbic, Jure Leskovec
Poster
Tue 17:00 Denoising Diffusion Implicit Models
Jiaming Song, Chenlin Meng, Stefano Ermon
Poster
Tue 17:00 A unifying view on implicit bias in training linear neural networks
Chulhee (Charlie) Yun, Shankar Krishnan, Hossein Mobahi
Poster
Tue 17:00 RNNLogic: Learning Logic Rules for Reasoning on Knowledge Graphs
Meng Qu, Junkun Chen, Louis-Pascal A Xhonneux, Yoshua Bengio, Jian Tang
Poster
Tue 17:00 Generalized Variational Continual Learning
Noel Loo, Siddharth Swaroop, Rich E Turner
Poster
Tue 17:00 Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets?
Zhiyuan Li, Yi Zhang, Sanjeev Arora
Poster
Tue 17:00 RMSprop converges with proper hyper-parameter
Naichen Shi, Dawei Li, Mingyi Hong, Ruoyu Sun
Spotlight
Tue 19:15 DDPNOpt: Differential Dynamic Programming Neural Optimizer
Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou
Spotlight
Tue 20:20 Async-RED: A Provably Convergent Asynchronous Block Parallel Stochastic Method using Deep Denoising Priors
Yu Sun, Jiaming Liu, Yiran Sun, Brendt Wohlberg, Ulugbek Kamilov
Invited Talk
Wed 0:00 Perceiving the 3D World from Images and Video
Lourdes Agapito
Poster
Wed 1:00 Bag of Tricks for Adversarial Training
Tianyu Pang, Xiao Yang, Yinpeng Dong, Hang Su, Jun Zhu
Poster
Wed 1:00 On Data-Augmentation and Consistency-Based Semi-Supervised Learning
Atin Ghosh, Alexandre Thiery
Poster
Wed 1:00 Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization
Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Du, Yu Wang, Yi Wu
Poster
Wed 1:00 Disentangled Recurrent Wasserstein Autoencoder
Jun Han, Martin Min, Ligong Han, Li Erran Li, Xuan Zhang
Poster
Wed 1:00 High-Capacity Expert Binary Networks
Adrian Bulat, Brais Martinez, Georgios Tzimiropoulos
Poster
Wed 1:00 No Cost Likelihood Manipulation at Test Time for Making Better Mistakes in Deep Networks
Shyamgopal Karthik, Ameya Prabhu, Puneet Dokania, Vineet Gandhi
Poster
Wed 1:00 Explainable Deep One-Class Classification
Philipp Liznerski, Lukas Ruff, Robert A Vandermeulen, Billy J Franks, Marius Kloft, Klaus R Muller
Poster
Wed 1:00 Net-DNF: Effective Deep Modeling of Tabular Data
Liran Katzir, Gal Elidan, Ran El-Yaniv
Poster
Wed 1:00 Deep Neural Network Fingerprinting by Conferrable Adversarial Examples
Nils Lukas, Yuxuan Zhang, Florian Kerschbaum
Poster
Wed 1:00 Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting
Yuan Yin, Vincent Le Guen, Jérémie Dona, Emmanuel de Bézenac, Ibrahim Ayed, Nicolas Thome, Patrick Gallinari
Poster
Wed 1:00 Reweighting Augmented Samples by Minimizing the Maximal Expected Loss
Mingyang Yi, Lu Hou, Lifeng Shang, Xin Jiang, Qun Liu, Zhi-Ming Ma
Poster
Wed 1:00 Interpretable Neural Architecture Search via Bayesian Optimisation with Weisfeiler-Lehman Kernels
Binxin Ru, Xingchen Wan, Xiaowen Dong, Michael Osborne
Poster
Wed 1:00 Fidelity-based Deep Adiabatic Scheduling
Eli Ovits, Lior Wolf
Poster
Wed 1:00 Explaining by Imitating: Understanding Decisions by Interpretable Policy Learning
Alihan Hüyük, Dan Jarrett, Cem Tekin, Mihaela van der Schaar
Poster
Wed 1:00 Learning from Demonstration with Weakly Supervised Disentanglement
Yordan Hristov, Subramanian Ramamoorthy
Oral
Wed 4:05 Getting a CLUE: A Method for Explaining Uncertainty Estimates
Javier Antorán, Umang Bhatt, Tameem Adel, Adrian Weller, José Miguel Hernández Lobato
Spotlight
Wed 4:40 Deep Neural Network Fingerprinting by Conferrable Adversarial Examples
Nils Lukas, Yuxuan Zhang, Florian Kerschbaum
Spotlight
Wed 5:15 Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods
Taiji Suzuki, Akiyama Shunta
Poster
Wed 9:00 Early Stopping in Deep Networks: Double Descent and How to Eliminate it
Reinhard Heckel, Fatih Furkan Yilmaz
Poster
Wed 9:00 Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking
Michael Schlichtkrull, Nicola De Cao, Ivan Titov
Poster
Wed 9:00 Sharpness-aware Minimization for Efficiently Improving Generalization
Pierre Foret, Ariel Kleiner, Hossein Mobahi, Behnam Neyshabur
Poster
Wed 9:00 Are Neural Nets Modular? Inspecting Functional Modularity Through Differentiable Weight Masks
Robert Csordas, Sjoerd van Steenkiste, Jürgen Schmidhuber
Poster
Wed 9:00 Graph Information Bottleneck for Subgraph Recognition
Junchi Yu, Tingyang Xu, Yu Rong, Yatao Bian, Junzhou Huang, Ran He
Poster
Wed 9:00 Few-Shot Bayesian Optimization with Deep Kernel Surrogates
Martin Wistuba, Josif Grabocka
Poster
Wed 9:00 Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning
Valerie Chen, Abhinav Gupta, Kenny Marino
Poster
Wed 9:00 TropEx: An Algorithm for Extracting Linear Terms in Deep Neural Networks
Martin Trimmel, Henning Petzka, Cristian Sminchisescu
Poster
Wed 9:00 Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies
Paul Pu Liang, Manzil Zaheer, Yuan Wang, Amr Ahmed
Poster
Wed 9:00 Property Controllable Variational Autoencoder via Invertible Mutual Dependence
Xiaojie Guo, Yuanqi Du, Liang Zhao
Poster
Wed 9:00 Probabilistic Numeric Convolutional Neural Networks
Marc Finzi, Roberto Bondesan, Max Welling
Oral
Wed 11:45 Evolving Reinforcement Learning Algorithms
John Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Quoc V Le, Sergey Levine, Honglak Lee, Aleksandra Faust
Spotlight
Wed 12:48 LambdaNetworks: Modeling long-range Interactions without Attention
Irwan Bello
Spotlight
Wed 13:38 Dynamic Tensor Rematerialization
Marisa Kirisame, Steven S. Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared G Roesch, Tianqi Chen, Zachary Tatlock
Poster
Wed 17:00 Evaluations and Methods for Explanation through Robustness Analysis
Cheng-Yu Hsieh, Chih-Kuan Yeh, Xuanqing Liu, Pradeep K Ravikumar, Seungyeon Kim, Sanjiv Kumar, Cho-Jui Hsieh
Poster
Wed 17:00 Economic Hyperparameter Optimization With Blended Search Strategy
Chi Wang, Qingyun Wu, Silu Huang, Amin Saied
Poster
Wed 17:00 AutoLRS: Automatic Learning-Rate Schedule by Bayesian Optimization on the Fly
Yuchen Jin, Tianyi Zhou, Liangyu Zhao, Yibo Zhu, Chuanxiong Guo, Marco Canini, Arvind Krishnamurthy
Poster
Wed 17:00 Evolving Reinforcement Learning Algorithms
John Co-Reyes, Yingjie Miao, Daiyi Peng, Esteban Real, Quoc V Le, Sergey Levine, Honglak Lee, Aleksandra Faust
Poster
Wed 17:00 BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse Vig, Ali Madani, Lav R Varshney, Caiming Xiong, Richard Socher, Nazneen Rajani
Poster
Wed 17:00 Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters
Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Hui, Jie Fu
Poster
Wed 17:00 Influence Functions in Deep Learning Are Fragile
Samyadeep Basu, Phil Pope, Soheil Feizi
Poster
Wed 17:00 Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation
Mrigank Raman, Aaron Chan, Siddhant Agarwal, PeiFeng Wang, Hansen Wang, Sungchul Kim, Ryan Rossi, Handong Zhao, Nedim Lipka, Xiang Ren
Poster
Wed 17:00 Evaluating the Disentanglement of Deep Generative Models through Manifold Topology
Sharon Zhou, Eric Zelikman, Fred Lu, Andrew Ng, Gunnar E Carlsson, Stefano Ermon
Poster
Wed 17:00 NBDT: Neural-Backed Decision Tree
Alvin Wan, Lisa Dunlap, Daniel Ho, Jihan Yin, Scott Lee, Suzanne Petryk, Sarah A Bargal, Joseph E Gonzalez
Poster
Wed 17:00 CPR: Classifier-Projection Regularization for Continual Learning
Sungmin Cha, Hsiang Hsu, Taebaek Hwang, Flavio Calmon, Taesup Moon
Poster
Wed 17:00 A Geometric Analysis of Deep Generative Image Models and Its Applications
Binxu Wang, Carlos Ponce
Poster
Wed 17:00 Explainable Subgraph Reasoning for Forecasting on Temporal Knowledge Graphs
Zhen Han, Peng Chen, Yunpu Ma, Volker Tresp
Poster
Wed 17:00 Learning Manifold Patch-Based Representations of Man-Made Shapes
Dmitriy Smirnov, Mikhail Bessmeltsev, Justin Solomon
Spotlight
Wed 19:15 GAN "Steerability" without optimization
Nurit Spingarn Eliezer, Ron Banner, Tomer Michaeli
Spotlight
Wed 21:25 Regularization Matters in Policy Optimization - An Empirical Study on Continuous Control
Zhuang Liu, Xuanlin Li, Bingyi Kang, Trevor Darrell
Oral
Thu 0:30 Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime
Atsushi Nitanda, Taiji Suzuki
Spotlight
Thu 0:45 Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with $1/n$ Parameters
Aston Zhang, Yi Tay, Shuai Zhang, Alvin Chan, Anh Tuan Luu, Siu Hui, Jie Fu
Poster
Thu 1:00 Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime
Atsushi Nitanda, Taiji Suzuki
Poster
Thu 1:00 Adaptive and Generative Zero-Shot Learning
Yu-Ying Chou, Hsuan-Tien (Tien) Lin, Tyng-Luh Liu
Poster
Thu 1:00 Continual learning in recurrent neural networks
Benjamin Ehret, Christian Henning, Maria Cervera, Alexander Meulemans, Johannes von Oswald, Benjamin F Grewe
Poster
Thu 1:00 CausalWorld: A Robotic Manipulation Benchmark for Causal Structure and Transfer Learning
Ossama Ahmed, Frederik Träuble, Anirudh Goyal, Alexander Neitz, Manuel Wuthrich, Yoshua Bengio, Bernhard Schoelkopf, Stefan Bauer
Poster
Thu 1:00 A Diffusion Theory For Deep Learning Dynamics: Stochastic Gradient Descent Exponentially Favors Flat Minima
Zeke Xie, Issei Sato, Masashi Sugiyama
Poster
Thu 1:00 Interpretable Models for Granger Causality Using Self-explaining Neural Networks
Ričards Marcinkevičs, Julia E Vogt
Poster
Thu 1:00 Efficient Inference of Flexible Interaction in Spiking-neuron Networks
Feng Zhou, Yixuan Zhang, Jun Zhu
Poster
Thu 1:00 Counterfactual Generative Networks
Axel Sauer, Andreas Geiger
Poster
Thu 1:00 GAN "Steerability" without optimization
Nurit Spingarn Eliezer, Ron Banner, Tomer Michaeli
Poster
Thu 1:00 ChipNet: Budget-Aware Pruning with Heaviside Continuous Approximations
Rishabh Tiwari, Udbhav Bamba, Arnav Chavan, Deepak Gupta
Oral
Thu 4:20 Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting
Yuan Yin, Vincent Le Guen, Jérémie Dona, Emmanuel de Bézenac, Ibrahim Ayed, Nicolas Thome, Patrick Gallinari
Poster
Thu 9:00 Initialization and Regularization of Factorized Neural Layers
Misha Khodak, Neil Tenenholtz, Lester Mackey, Nicolo Fusi
Poster
Thu 9:00 On Position Embeddings in BERT
Wang Benyou, Lifeng Shang, Christina Lioma, Xin Jiang, Hao Yang, Qun Liu, Jakob Simonsen
Poster
Thu 9:00 Meta-learning with negative learning rates
Alberto Bernacchia
Thu 9:00 Bad hypothesis contest
Poster
Thu 9:00 Learning to Set Waypoints for Audio-Visual Navigation
Changan Chen, Sagnik Majumder, Ziad Al-Halah, Ruohan Gao, Santhosh Kumar Ramakrishnan, Kristen Grauman
Poster
Thu 9:00 A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora
Poster
Thu 9:00 Implicit Gradient Regularization
David Barrett, Benoit Dherin
Poster
Thu 9:00 Dynamic Tensor Rematerialization
Marisa Kirisame, Steven S. Lyubomirsky, Altan Haan, Jennifer Brennan, Mike He, Jared G Roesch, Tianqi Chen, Zachary Tatlock
Poster
Thu 9:00 Deep Networks and the Multiple Manifold Problem
Sam Buchanan, Dar Gilboa, John Wright
Poster
Thu 9:00 A teacher-student framework to distill future trajectories
Alexander Neitz, Giambattista Parascandolo, Bernhard Schoelkopf
Poster
Thu 9:00 BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization
Huanrui Yang, Lin Duan, Yiran Chen, Hai Li
Poster
Thu 9:00 Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
Rafael Valle, Kevin J Shih, Ryan Prenger, Bryan Catanzaro
Poster
Thu 9:00 Learning to Recombine and Resample Data For Compositional Generalization
Ekin Akyürek, Afra Feyza Akyürek, Jacob Andreas
Oral
Thu 11:45 Why Are Convolutional Nets More Sample-Efficient than Fully-Connected Nets?
Zhiyuan Li, Yi Zhang, Sanjeev Arora
Spotlight
Thu 13:50 Disentangled Recurrent Wasserstein Autoencoder
Jun Han, Martin Min, Ligong Han, Li Erran Li, Xuan Zhang
Poster
Thu 17:00 Prototypical Representation Learning for Relation Extraction
Ning Ding, Xiaobin Wang, Yao Fu, Guangwei Xu, Rui Wang, Pengjun Xie, Ying Shen, Fei Huang, Hai-Tao Zheng, Rui Zhang
Poster
Thu 17:00 Multi-timescale Representation Learning in LSTM Language Models
Shivangi Mahto, Vy Vo, Javier Turek, Alexander Huth
Poster
Thu 17:00 Extreme Memorization via Scale of Initialization
Harsh Mehta, Ashok Cutkosky, Behnam Neyshabur
Poster
Thu 17:00 Representing Partial Programs with Blended Abstract Semantics
Maxwell Nye, Yewen Pu, Matthew Bowers, Jacob Andreas, Joshua B Tenenbaum, Armando Solar-Lezama
Poster
Thu 17:00 ANOCE: Analysis of Causal Effects with Multiple Mediators via Constrained Structural Learning
Hengrui Cai, Rui Song, Wenbin Lu
Poster
Thu 17:00 Convex Regularization behind Neural Reconstruction
Arda Sahiner, Morteza Mardani, Batu Ozturkler, Mert Pilanci, John M Pauly
Poster
Thu 17:00 Async-RED: A Provably Convergent Asynchronous Block Parallel Stochastic Method using Deep Denoising Priors
Yu Sun, Jiaming Liu, Yiran Sun, Brendt Wohlberg, Ulugbek Kamilov
Poster
Thu 17:00 In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning
Mamshad Nayeem Rizve, Kevin Duarte, Yogesh S Rawat, Mubarak Shah
Poster
Thu 17:00 BiPointNet: Binary Neural Network for Point Clouds
Haotong Qin, Zhongang Cai, Mingyuan Zhang, Yifu Ding, Haiyu Zhao, Shuai Yi, Xianglong Liu, Hao Su
Poster
Thu 17:00 Learning Energy-Based Generative Models via Coarse-to-Fine Expanding and Sampling
Yang Zhao, Jianwen Xie, Ping Li
Poster
Thu 17:00 The Recurrent Neural Tangent Kernel
Sina Alemohammad, Jack Wang, Randall Balestriero, Richard Baraniuk
Poster
Thu 17:00 Multi-Prize Lottery Ticket Hypothesis: Finding Accurate Binary Neural Networks by Pruning A Randomly Weighted Network
James Diffenderfer, Bhavya Kailkhura
Poster
Thu 17:00 Evaluation of Similarity-based Explanations
Kazuaki Hanawa, Sho Yokoi, Satoshi Hara, Kentaro Inui
Poster
Thu 17:00 HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark
Chaojian Li, Zhongzhi Yu, Yonggan Fu, Yongan Zhang, Yang Zhao, Haoran You, Qixuan Yu, Yue Wang, Cong Hao, Yingyan Lin
Poster
Thu 17:00 Fully Unsupervised Diversity Denoising with Convolutional Variational Autoencoders
Mangal Prakash, Alexander Krull, Florian Jug
Poster
Thu 17:00 Neural Pruning via Growing Regularization
Huan Wang, Can Qin, Yulun Zhang, Yun Fu
Poster
Thu 17:00 Adapting to Reward Progressivity via Spectral Reinforcement Learning
Michael Dann, John Thangarajah
Poster
Thu 17:00 A Learning Theoretic Perspective on Local Explainability
Jeffrey Li, Vaishnavh Nagarajan, Gregory Plumb, Ameet Talwalkar
Spotlight
Thu 19:55 RMSprop converges with proper hyper-parameter
Naichen Shi, Dawei Li, Mingyi Hong, Ruoyu Sun
Workshop
Fri 3:25 Do Input Gradients Highlight Discriminative Features?
Harshay Shah
Workshop
Fri 5:15 Beyond Static Papers: Rethinking How We Share Scientific Understanding in ML
Krishna Murthy Jatavallabhula, Bhairav Mehta, Tegan Maharaj, Amy Tabb, Khimya Khetarpal, Aditya Kusupati, Anna Rogers, Sara Hooker, Breandan Considine, Devi Parikh, Derek Nowrouzezahrai, Yoshua Bengio
Workshop
Fri 6:00 AIMOCC -- AI: Modeling Oceans and Climate Change
Luis Martí, Nayat Sánchez-Pi
Workshop
Fri 6:30 How Can Findings About The Brain Improve AI Systems?
Shinji Nishimoto, Leila Wehbe, Alexander Huth, Javier Turek, Nicole Beckage, Vy Vo, Mariya Toneva, Hsiang-Yun Chien, Shailee Jain, Richard Antonello
Workshop
Fri 6:45 Responsible AI (RAI)
Ahmad Beirami, Emily Black, Krishna Gummadi, Hoda Heidari, Baharan Mirzasoleiman, Meisam Razaviyayn, Joshua Williams
Workshop
Fri 7:00 Interpretable Recommender System With Heterogeneous Information: A Geometric Deep Learning Perspective
Yan Leng
Workshop
Fri 7:05 Model Discovery in the Sparse Sampling Regime
Gert-Jan Both, Georges Tod, Remy Kusters
Workshop
Fri 7:55 ICLR 2021 Workshop on Embodied Multimodal Learning (EML)
Ruohan Gao, Andrew Owens, Dinesh Jayaraman, Yuke Zhu, Jiajun Wu, Kristen Grauman
Workshop
Fri 8:36 Fairly Estimating Socioeconomic Status Under Costly Feature Acquisition
Kush R Varshney
Workshop
Fri 9:01 "Differentially Private Synthetic Data Generations Using Generative Adversarial Networks" by Jinsung Yoon, Google Cloud AI
Jinsung Yoon
Workshop
Fri 10:30 Gal Mishne: Visualizing the PHATE of deep neural networks
Gal Mishne
Workshop
Fri 11:30 Spotlight 6: Lucas Theis|Jonathan Ho, Importance weighted compression
Workshop
Fri 11:52 Poster Spotlight "A multi-objective perspective on tuning hardware and hyperparameters"
David Salinas
Workshop
Fri 11:54 Poster Spotlight "Simulation-based Scoring for Model-based Asynchronous Hyperparameter and Neural Architecture Search"
Matthias Seeger
Workshop
Fri 11:56 Poster Spotlight "AutoHAS: Efficient Hyperparameter and Architecture Search"
Xuanyi Dong
Workshop
Fri 15:10 On Linear Interpolation in the Latent Space of Deep Generative Models
Mike Yan Michelis, Quentin Becker
Workshop
Regularization Can Help Mitigate Poisoning Attacks... with the Right Hyperparameters
Javier Carnerero-Cano
Workshop
On Privacy and Confidentiality of Communications in Organizational Graphs
Masoumeh Shafieinejad, Huseyin Inan, Marcello Hasegawa, Robert Sim
Workshop
On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning
Marc Vischer, Henning Sprekeler, Robert Lange
Workshop
OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation
Jongmin Lee, Wonseok Jeon, Byung-Jun Lee, Joelle Pineau, Kee-Eung Kim
Workshop
Extracting Hyperparameter Constraints From Code
Ingkarat Rak-amnouykit
Workshop
High-Robustness, Low-Transferability Fingerprinting of Neural Networks
Siyue Wang
Workshop
Federated Learning's Blessing: FedAvg has Linear Speedup
Zhaonan Qu, Kaixiang Lin, Zhaojian Li, Jiayu Zhou, Zhengyuan Zhou
Workshop
Distributed Gaussian Differential Privacy Via Shuffling
Kan Chen, Qi Long
Workshop
Understanding Clipped FedAvg: Convergence and Client-Level Differential Privacy
Xinwei Zhang, Xiangyi Chen, Jinfeng Yi, Steven Wu, Mingyi Hong
Workshop
Multi-Task Reinforcement Learning with Context-based Representations
Shagun Sodhani, Amy Zhang, Joelle Pineau