Downloads 2022
Number of events: 1123
- $\beta$-Intact-VAE: Identifying and Estimating Causal Effects under Limited Overlap
- $\mathrm{SO}(2)$-Equivariant Reinforcement Learning
- $\pi$BO: Augmenting Acquisition Functions with User Beliefs for Bayesian Optimization
- 3rd Workshop on practical ML for Developing Countries: learning under limited/low resource scenarios
- 8-bit Optimizers via Block-wise Quantization
- Ab-Initio Potential Energy Surfaces by Pairing GNNs with Neural Wave Functions
- A Biologically Interpretable Graph Convolutional Network to Link Genetic Risk Pathways and Imaging Phenotypes of Disease
- Accelerated Policy Learning with Parallel Differentiable Simulation
- Accelerating AI Systems: Let the Data Flow!
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training
- A Class of Short-term Recurrence Anderson Mixing Methods and Their Applications
- A Comparison of Hamming Errors of Representative Variable Selection Methods
- A Conditional Point Diffusion-Refinement Paradigm for 3D Point Cloud Completion
- Active Hierarchical Exploration with Stable Subgoal Representation Learning
- Actor-critic is implicitly biased towards high entropy optimal policies
- Actor-Critic Policy Optimization in a Large-Scale Imperfect-Information Game
- AdaAug: Learning Class- and Instance-adaptive Data Augmentation Policies
- AdaMatch: A Unified Approach to Semi-Supervised Learning and Domain Adaptation
- Ada-NETS: Face Clustering via Adaptive Neighbour Discovery in the Structure Space
- Adaptive Wavelet Transformer Network for 3D Shape Representation Learning
- AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning
- ADAVI: Automatic Dual Amortized Variational Inference Applied To Pyramidal Bayesian Models
- A Deep Variational Approach to Clustering Survival Data
- Adversarially Robust Conformal Prediction
- Adversarial Retriever-Ranker for Dense Text Retrieval
- Adversarial Robustness Through the Lens of Causality
- Adversarial Support Alignment
- Adversarial Unlearning of Backdoors via Implicit Hypergradient
- AEVA: Black-box Backdoor Detection Using Adversarial Extreme Value Analysis
- A fast and accurate splitting method for optimal transport: analysis and implementation
- ‘Affordances’ for Machine Learning
- A Fine-Grained Analysis on Distribution Shift
- A Fine-Tuning Approach to Belief State Modeling
- A First-Occupancy Representation for Reinforcement Learning
- AfricaNLP 2022: NLP for African languages
- A General Analysis of Example-Selection for Stochastic Gradient Descent
- A generalization of the randomized singular value decomposition
- A Generalized Weighted Optimization Method for Computational Learning and Inversion
- A global convergence theory for deep ReLU implicit networks via over-parameterization
- AI for Earth and Space Science
- A Johnson-Lindenstrauss Framework for Randomly Initialized CNNs
- Almost Tight L0-norm Certified Robustness of Top-k Predictions against Adversarial Perturbations
- A Loss Curvature Perspective on Training Instabilities of Deep Learning Models
- AlphaZero-based Proof Cost Network to Aid Game Solving
- Amortized Implicit Differentiation for Stochastic Bilevel Optimization
- Amortized Tree Generation for Bottom-up Synthesis Planning and Synthesizable Molecular Design
- An Agnostic Approach to Federated Learning with Class Imbalance
- Analytic-DPM: an Analytic Estimate of the Optimal Reverse Variance in Diffusion Probabilistic Models
- Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation
- An Autoregressive Flow Model for 3D Molecular Geometry Generation from Scratch
- Ancestral protein sequence reconstruction using a tree-structured Ornstein-Uhlenbeck variational autoencoder
- A Neural Tangent Kernel Perspective of Infinite Tree Ensembles
- A New Perspective on "How Graph Neural Networks Go Beyond Weisfeiler-Lehman?"
- An Experimental Design Perspective on Model-Based Reinforcement Learning
- An Explanation of In-context Learning as Implicit Bayesian Inference
- An Information Fusion Approach to Learning with Instance-Dependent Label Noise
- Anisotropic Random Feature Regression in High Dimensions
- Anomaly Detection for Tabular Data with Internal Contrastive Learning
- Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy
- A NON-PARAMETRIC REGRESSION VIEWPOINT: GENERALIZATION OF OVERPARAMETRIZED DEEP RELU NETWORK UNDER NOISY OBSERVATIONS
- An Operator Theoretic View On Pruning Deep Neural Networks
- Anti-Concentrated Confidence Bonuses For Scalable Exploration
- Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice
- An Unconstrained Layer-Peeled Perspective on Neural Collapse
- Anytime Dense Prediction with Confidence Adaptivity
- Approximation and Learning with Deep Convolutional Models: a Kernel Perspective
- A Program to Build E(N)-Equivariant Steerable CNNs
- A Reduction-Based Framework for Conservative Bandits and Reinforcement Learning
- A Relational Intervention Approach for Unsupervised Dynamics Generalization in Model-Based Reinforcement Learning
- ARTEMIS: Attention-based Retrieval with Text-Explicit Matching and Implicit Similarity
- AS-MLP: An Axial Shifted MLP Architecture for Vision
- Assessing Generalization of SGD via Disagreement
- Associated Learning: an Alternative to End-to-End Backpropagation that Works on CNN, RNN, and Transformer
- A Statistical Framework for Efficient Out of Distribution Detection in Deep Neural Networks
- Asymmetry Learning for Counterfactually-invariant Classification in OOD Tasks
- A Tale of Two Flows: Cooperative Learning of Langevin Flow and Normalizing Flow Toward Energy-Based Model
- A Theoretical Analysis on Feature Learning in Neural Networks: Emergence from Inputs and Advantage over Fixed Features
- A Theory of Tournament Representations
- Attacking deep networks with surrogate-based adversarial black-box methods is easy
- Attention-based Interpretability with Concept Transformers
- Audio Lottery: Speech Recognition Made Ultra-Lightweight, Noise-Robust, and Transferable
- Augmented Sliced Wasserstein Distances
- A Unified Contrastive Energy-based Model for Understanding the Generative Ability of Adversarial Training
- A Unified Wasserstein Distributional Robustness Framework for Adversarial Training
- Automated Self-Supervised Learning for Graphs
- Automatic Loss Function Search for Predict-Then-Optimize Problems with Strong Ranking Property
- Autonomous Learning of Object-Centric Abstractions for High-Level Planning
- Autonomous Reinforcement Learning: Formalism and Benchmarking
- Autoregressive Diffusion Models
- Autoregressive Quantile Flows for Predictive Uncertainty Estimation
- Auto-scaling Vision Transformers without Training
- Auto-Transfer: Learning to Route Transferable Representations
- Axiomatic Explanations for Visual Search, Retrieval, and Similarity Learning
- A Zest of LIME: Towards Architecture-Independent Model Distances
- Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future
- Backdoor Defense via Decoupling the Training Process
- BadPre: Task-agnostic Backdoor Attacks to Pre-trained NLP Foundation Models
- Bag of Instances Aggregation Boosts Self-supervised Distillation
- BAM: Bayes with Adaptive Memory
- Bandit Learning with Joint Effect of Incentivized Sampling, Delayed Sampling Feedback, and Self-Reinforcing User Preferences
- Bayesian Framework for Gradient Leakage
- Bayesian Modeling and Uncertainty Quantification for Learning to Optimize: What, Why, and How
- Bayesian Neural Network Priors Revisited
- BDDM: Bilateral Denoising Diffusion Models for Fast and High-Quality Speech Synthesis
- BEiT: BERT Pre-Training of Image Transformers
- Benchmarking the Spectrum of Agent Capabilities
- Better Supervisory Signals by Observing Learning Paths
- Beyond ImageNet Attack: Towards Crafting Adversarial Examples for Black-box Domains
- Beyond interpretability: developing a language to shape our relationships with AI
- BiBERT: Accurate Fully Binarized BERT
- Bi-linear Value Networks for Multi-goal Reinforcement Learning
- Blaschke Product Neural Networks (BPNN): A Physics-Infused Neural Network for Phase Retrieval of Meromorphic Functions
- Boosted Curriculum Reinforcement Learning
- Boosting Randomized Smoothing with Variance Reduced Classifiers
- Boosting the Certified Robustness of L-infinity Distance Nets
- Bootstrapped Meta-Learning
- Bootstrapping Semantic Segmentation with Regional Contrast
- Bregman Gradient Policy Optimization
- Bridging Recommendation and Marketing via Recurrent Intensity Modeling
- Bridging the Gap: Providing Post-Hoc Symbolic Explanations for Sequential Decision-Making Problems with Inscrutable Representations
- Bundle Networks: Fiber Bundles, Local Trivializations, and a Generative Approach to Exploring Many-to-one Maps
- Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing
- CADDA: Class-wise Automatic Differentiable Data Augmentation for EEG Signals
- Can an Image Classifier Suffice For Action Recognition?
- Capacity of Group-invariant Linear Readouts from Equivariant Representations: How Many Objects can be Linearly Classified Under All Possible Views?
- Capturing Structural Locality in Non-parametric Language Models
- Case-based reasoning for better generalization in textual reinforcement learning
- Causal Contextual Bandits with Targeted Interventions
- CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation
- Certified Robustness for Deep Equilibrium Models via Interval Bound Propagation
- Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap
- Charformer: Fast Character Transformers via Gradient-based Subword Tokenization
- Chemical-Reaction-Aware Molecule Representation Learning
- Chunked Autoregressive GAN for Conditional Waveform Synthesis
- Churn Reduction via Distillation
- CKConv: Continuous Kernel Convolution For Sequential Data
- Clean Images are Hard to Reblur: Exploiting the Ill-Posed Inverse Task for Dynamic Scene Deblurring
- CLEVA-Compass: A Continual Learning Evaluation Assessment Compass to Promote Research Transparency and Comparability
- ClimateGAN: Raising Climate Change Awareness by Generating Images of Floods
- Closed-form Sample Probing for Learning Generative Models in Zero-shot Learning
- CoBERL: Contrastive BERT for Reinforcement Learning
- CodeTrek: Flexible Modeling of Code using an Extensible Relational Representation
- Coherence-based Label Propagation over Time Series for Accelerated Active Learning
- Cold Brew: Distilling Graph Node Representations with Incomplete or Missing Neighborhoods
- Collapse by Conditioning: Training Class-conditional GANs with Limited Data
- Communication-Efficient Actor-Critic Methods for Homogeneous Markov Games
- Comparing Distributions by Measuring Differences that Affect Decision Making
- ComPhy: Compositional Physical Reasoning of Objects and Events from Videos
- Complete Verification via Multi-Neuron Relaxation Guided Branch-and-Bound
- Compositional Attention: Disentangling Search and Retrieval
- Compositional Training for End-to-End Deep AUC Maximization
- CoMPS: Continual Meta Policy Search
- Concurrent Adversarial Learning for Large-Batch Training
- Conditional Contrastive Learning with Kernel
- Conditional Image Generation by Conditioning Variational Auto-Encoders
- Conditional Object-Centric Learning from Video
- Conditioning Sequence-to-sequence Networks with Learned Activations
- ConFeSS: A Framework for Single Source Cross-Domain Few-Shot Learning
- Connectome-constrained Latent Variable Model of Whole-Brain Neural Activity
- Consistent Counterfactuals for Deep Models
- Constrained Physical-Statistics Models for Dynamical System Identification and Prediction
- Constrained Policy Optimization via Bayesian World Models
- Constraining Linear-chain CRFs to Regular Languages
- Constructing a Good Behavior Basis for Transfer using Generalized Policy Updates
- Constructing Orthogonal Convolutions in an Explicit Manner
- Contact Points Discovery for Soft-Body Manipulations with Differentiable Physics
- Context-Aware Sparse Deep Coordination Graphs
- Contextualized Scene Imagination for Generative Commonsense Reasoning
- Continual Learning with Filter Atom Swapping
- Continual Learning with Recursive Gradient Optimization
- Continual Normalization: Rethinking Batch Normalization for Online Continual Learning
- Continuously Discovering Novel Strategies via Reward-Switching Policy Optimization
- Continuous-Time Meta-Learning with Forward Mode Differentiation
- Contrastive Clustering to Mine Pseudo Parallel Data for Unsupervised Translation
- Contrastive Fine-grained Class Clustering via Generative Adversarial Networks
- Controlling Directions Orthogonal to a Classifier
- Controlling the Complexity and Lipschitz Constant improves Polynomial Nets
- Convergent and Efficient Deep Q Learning Algorithm
- Convergent Graph Solvers
- Coordination Among Neural Modules Through a Shared Global Workspace
- CoordX: Accelerating Implicit Neural Representation with a Split MLP Architecture
- COPA: Certifying Robust Policies for Offline Reinforcement Learning against Poisoning Attacks
- COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation
- cosFormer: Rethinking Softmax In Attention
- CoST: Contrastive Learning of Disentangled Seasonal-Trend Representations for Time Series Forecasting
- CoSubmitting Summer (CSS) Workshop
- Counterfactual Plans under Distributional Ambiguity
- C-Planning: An Automatic Curriculum for Learning Goal-Reaching Tasks
- Creating Training Sets via Weak Indirect Supervision
- Critical Points in Quantum Generative Models
- CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing
- CrossBeam: Learning to Search in Bottom-Up Program Synthesis
- Cross-Domain Imitation Learning via Optimal Transport
- CrossFormer: A Versatile Vision Transformer Hinging on Cross-scale Attention
- Cross-Lingual Transfer with Class-Weighted Language-Invariant Representations
- CrossMatch: Cross-Classifier Consistency Regularization for Open-Set Single Domain Generalization
- Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL
- CrowdPlay: Crowdsourcing Human Demonstrations for Offline Learning
- Crystal Diffusion Variational Autoencoder for Periodic Material Generation
- Curriculum learning as a tool to uncover learning principles in the brain
- CURVATURE-GUIDED DYNAMIC SCALE NETWORKS FOR MULTI-VIEW STEREO
- CycleMLP: A MLP-like Architecture for Dense Prediction
- DAB-DETR: Dynamic Anchor Boxes are Better Queries for DETR
- DARA: Dynamics-Aware Reward Augmentation in Offline Reinforcement Learning
- Data-Driven Offline Optimization for Architecting Hardware Accelerators
- Data-Efficient Graph Grammar Learning for Molecular Generation
- Data Efficient Language-Supervised Zero-Shot Recognition with Optimal Transport Distillation
- Data Poisoning Won’t Save You From Facial Recognition
- D-CODE: Discovering Closed-form ODEs from Observed Trajectories
- Dealing with Non-Stationarity in MARL via Trust-Region Decomposition
- Decentralized Learning for Overparameterized Problems: A Multi-Agent Kernel Approximation Approach
- Declarative nets that are equilibrium models
- Deconstructing the Inductive Biases of Hamiltonian Neural Networks
- Decoupled Adaptation for Cross-Domain Object Detection
- Deep Attentive Variational Inference
- Deep AutoAugment
- Deep Ensembling with No Overhead for either Training or Testing: The All-Round Blessings of Dynamic Sparsity
- Deep Generative Models for Highly Structured Data
- Deep Learning for Code
- Deep Learning on Graphs for Natural Language Processing
- Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers
- Deep Point Cloud Reconstruction
- Deep ReLU Networks Preserve Expected Length
- Defending Against Image Corruptions Through Adversarial Augmentations
- DEGREE: Decomposition Based Explanation for Graph Neural Networks
- Delaunay Component Analysis for Evaluation of Data Representations
- DemoDICE: Offline Imitation Learning with Supplementary Imperfect Demonstrations
- Demystifying Batch Normalization in ReLU Networks: Equivalent Convex Optimization Models and Implicit Regularization
- Demystifying Limited Adversarial Transferability in Automatic Speech Recognition Systems
- Denoising Likelihood Score Matching for Conditional Score-based Data Generation
- DEPTS: Deep Expansion Learning for Periodic Time Series Forecasting
- DeSKO: Stability-Assured Robust Control with a Deep Stochastic Koopman Operator
- DictFormer: Tiny Transformer with Shared Dictionary
- Differentiable DAG Sampling
- Differentiable Expectation-Maximization for Set Representation Learning
- Differentiable Gradient Sampling for Learning Implicit 3D Scene Reconstructions from a Single Image
- Differentiable Prompt Makes Pre-trained Language Models Better Few-shot Learners
- Differentiable Scaffolding Tree for Molecule Optimization
- Differentially Private Fine-tuning of Language Models
- Differentially Private Fractional Frequency Moments Estimation with Polylogarithmic Space
- DiffSkill: Skill Abstraction from Differentiable Physics for Deformable Object Manipulations with Tools
- Diffusion-Based Voice Conversion with Fast Maximum Likelihood Sampling Scheme
- Direct then Diffuse: Incremental Unsupervised Skill Discovery for State Covering and Goal Reaching
- DISCOVERING AND EXPLAINING THE REPRESENTATION BOTTLENECK OF DNNS
- Discovering Invariant Rationales for Graph Neural Networks
- Discovering Latent Concepts Learned in BERT
- Discovering Nonlinear PDEs from Scarce Data with Physics-encoded Learning
- Discrepancy-Based Active Learning for Domain Adaptation
- Discrete Representations Strengthen Vision Transformer Robustness
- Discriminative Similarity for Data Clustering
- Disentanglement Analysis with Partial Information Decomposition
- DISSECT: Disentangled Simultaneous Explanations via Concept Traversals
- Distilling GANs with Style-Mixed Triplets for X2I Translation with Limited Data
- Distributionally Robust Fair Principal Components via Geodesic Descents
- Distributionally Robust Models with Parametric Likelihood Ratios
- Distributional Reinforcement Learning with Monotonic Splines
- Distribution Compression in Near-Linear Time
- Diurnal or Nocturnal? Federated Learning of Multi-branch Networks from Periodically Shifting Distributions
- DIVA: Dataset Derivative of a Learning Task
- Dive Deeper Into Integral Pose Regression
- Divergence-aware Federated Self-Supervised Learning
- Diverse Client Selection for Federated Learning via Submodular Maximization
- Divisive Feature Normalization Improves Image Recognition Performance in AlexNet
- DKM: Differentiable k-Means Clustering Layer for Neural Network Compression
- Do deep networks transfer invariances across classes?
- Does your graph need a confidence boost? Convergent boosted smoothing on graphs with tabular node features
- Doina Precup
- Domain Adversarial Training: A Game Perspective
- Domino: Discovering Systematic Errors with Cross-Modal Embeddings
- Do Not Escape From the Manifold: Discovering the Local Coordinates on the Latent Space of GANs
- Doubly Adaptive Scaled Algorithm for Machine Learning Using Second-Order Information
- Do Users Benefit From Interpretable Vision? A User Study, Baseline, And Dataset
- Do We Need Anisotropic Graph Neural Networks?
- Do you see what I see? Large-scale learning from multimodal videos
- DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization
- DriPP: Driven Point Processes to Model Stimuli Induced Patterns in M/EEG Signals
- Dropout Q-Functions for Doubly Efficient Reinforcement Learning
- Dual Lottery Ticket Hypothesis
- Dynamics-Aware Comparison of Learned Reward Functions
- Dynamic Token Normalization improves Vision Transformers
- EE-Net: Exploitation-Exploration Neural Networks in Contextual Bandits
- Effective Model Sparsification by Scheduled Grow-and-Prune Methods
- Effect of scale on catastrophic forgetting in neural networks
- Efficient Active Search for Combinatorial Optimization Problems
- Efficient and Differentiable Conformal Prediction with General Function Classes
- Efficient Computation of Deep Nonlinear Infinite-Width Neural Networks that Learn Features
- Efficient Learning of Safe Driving Policy via Human-AI Copilot Optimization
- Efficiently Modeling Long Sequences with Structured State Spaces
- Efficient Neural Causal Discovery without Acyclicity Constraints
- Efficient Self-supervised Vision Transformers for Representation Learning
- Efficient Sharpness-aware Minimization for Improved Training of Neural Networks
- Efficient Split-Mix Federated Learning for On-Demand and In-Situ Customization
- Efficient Token Mixing for Transformers via Adaptive Fourier Neural Operators
- Eigencurve: Optimal Learning Rate Schedule for SGD on Quadratic Objectives with Skewed Hessian Spectrums
- EigenGame Unloaded: When playing games is better than optimizing
- Einops: Clear and Reliable Tensor Manipulations with Einstein-like Notation
- Eliminating Sharp Minima from SGD with Truncated Heavy-tailed Noise
- Embedded-model flows: Combining the inductive biases of model-free deep learning and explicit probabilistic modeling
- Emergent Communication at Scale
- Emergent Communication: New Frontiers
- Enabling Arbitrary Translation Objectives with Adaptive Tree Search
- Encoding Weights of Irregular Sparsity for Fixed-to-Fixed Model Compression
- End-to-End Learning of Probabilistic Hierarchies on Graphs
- Energy-Based Learning for Cooperative Games, with Applications to Valuation Problems in Machine Learning
- Energy-Inspired Molecular Conformation Optimization
- Enhancing Cross-lingual Transfer by Manifold Mixup
- EntQA: Entity Linking as Question Answering
- Entroformer: A Transformer-based Entropy Model for Learned Image Compression
- Environment Predictive Coding for Visual Navigation
- Equivariant and Stable Positional Encoding for More Powerful Graph Neural Networks
- Equivariant Graph Mechanics Networks with Constraints
- Equivariant Self-Supervised Learning: Encouraging Equivariance in Representations
- Equivariant Subgraph Aggregation Networks
- Equivariant Transformers for Neural Network based Molecular Potentials
- Escaping limit cycles: Global convergence for constrained nonconvex-nonconcave minimax problems
- Evading Adversarial Example Detection Defenses with Orthogonal Projected Gradient Descent
- Evaluating Disentanglement of Structured Representations
- Evaluating Distributional Distortion in Neural Language Modeling
- Evaluating Model-Based Planning and Planner Amortization for Continuous Control
- Evaluation Metrics for Graph Generative Models: Problems, Pitfalls, and Practical Solutions
- Evidential Turing Processes
- EViT: Expediting Vision Transformers via Token Reorganizations
- Evolutionary Diversity Optimization with Clustering-based Selection for Reinforcement Learning
- EXACT: Scalable Graph Neural Networks Training via Extreme Activation Compression
- Explainable GNN-Based Models over Knowledge Graphs
- Explaining Point Processes by Learning Interpretable Temporal Logic Rules
- Explanations of Black-Box Models based on Directional Feature Interactions
- Exploiting Class Activation Value for Partial-Label Learning
- Exploring extreme parameter compression for pre-trained language models
- Exploring Memorization in Adversarial Training
- Exploring the Limits of Large Scale Pre-training
- Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis-Hastings
- Expressiveness and Approximation Properties of Graph Neural Networks
- Expressivity of Emergent Languages is a Trade-off between Contextual Complexity and Unpredictability
- ExT5: Towards Extreme Multi-Task Scaling for Transfer Learning
- Extending the WILDS Benchmark for Unsupervised Adaptation
- F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
- FairCal: Fairness Calibration for Face Verification
- Fairness Guarantees under Demographic Shift
- Fairness in Representation for Multilingual NLP: Insights from Controlled Experiments on Conditional Language Modeling
- Fair Normalizing Flows
- FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations
- Fast AdvProp
- Fast Differentiable Matrix Square Root
- Fast Generic Interaction Detection for Model Interpretability and Compression
- Fast Model Editing at Scale
- Fast Regression for Structured Inputs
- FastSHAP: Real-Time Shapley Value Estimation
- Fast topological clustering with Wasserstein distance
- Feature Kernel Distillation
- FedBABU: Toward Enhanced Representation for Federated Image Classification
- FedChain: Chained Algorithms for Near-optimal Communication Cost in Federated Learning
- Federated Learning from Only Unlabeled Data with Class-conditional-sharing Clients
- FedPara: Low-rank Hadamard Product for Communication-Efficient Federated Learning
- Few-Shot Backdoor Attacks on Visual Object Tracking
- Few-shot Learning via Dirichlet Tessellation Ensemble
- FILIP: Fine-grained Interactive Language-Image Pre-Training
- Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks
- FILM: Following Instructions in Language with Modular Methods
- Filtered-CoPhy: Unsupervised Learning of Counterfactual Physics in Pixel Space
- Finding an Unsupervised Image Segmenter in each of your Deep Generative Models
- Finding Biological Plausibility for Adversarially Robust Features via Metameric Tasks
- Fine-grained Differentiable Physics: A Yarn-level Model for Fabrics
- Finetuned Language Models are Zero-Shot Learners
- Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution
- Finite-Time Convergence and Sample Complexity of Multi-Agent Actor-Critic Reinforcement Learning with Average Reward
- Fixed Neural Network Steganography: Train the images, not the network
- FlexConv: Continuous Kernel Convolutions With Differentiable Kernel Sizes
- Focus on the Common Good: Group Distributional Robustness Follows
- Fooling Explanations in Text Classifiers
- Fortuitous Forgetting in Connectionist Networks
- FP-DETR: Detection Transformer Advanced by Fully Pre-training
- Frame Averaging for Invariant and Equivariant Network Design
- Frequency-aware SGD for Efficient Embedding Learning with Provable Benefits
- From Cells to Societies: Collective Learning Across Scales
- From Intervention to Domain Transportation: A Novel Perspective to Optimize Recommendation
- From Stars to Subgraphs: Uplifting Any GNN with Local Structure Awareness
- Gamification and Multiagent Solutions
- GATSBI: Generative Adversarial Training for Simulation-Based Inference
- Gaussian Mixture Convolution Networks
- GDA-AM: ON THE EFFECTIVENESS OF SOLVING MINIMAX OPTIMIZATION VIA ANDERSON MIXING
- GeneDisco: A Benchmark for Experimental Design in Drug Discovery
- Generalisation in Lifelong Reinforcement Learning through Logical Composition
- Generalizable Policy Learning in the Physical World
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness
- Generalization Through the Lens of Leave-One-Out Error
- Generalized Decision Transformer for Offline Hindsight Information Matching
- Generalized Demographic Parity for Group Fairness
- Generalized Kernel Thinning
- Generalized Natural Gradient Flows in Hidden Convex-Concave Games and GANs
- Generalized rectifier wavelet covariance models for texture synthesis
- Generalizing Few-Shot NAS with Gradient Matching
- Generating Videos with Dynamics-aware Implicit Generative Adversarial Networks
- Generative Modeling with Optimal Transport Maps
- Generative Models as a Data Source for Multiview Representation Learning
- Generative Planning for Temporally Coordinated Exploration in Reinforcement Learning
- Generative Principal Component Analysis
- Generative Pseudo-Inverse Memory
- GeoDiff: A Geometric Diffusion Model for Molecular Conformation Generation
- Geometrical and Topological Representation Learning
- Geometric and Physical Quantities improve E(3) Equivariant Message Passing
- Geometric Transformers for Protein Interface Contact Prediction
- Geometry-Consistent Neural Shape Representation with Implicit Displacement Fields
- GiraffeDet: A Heavy-Neck Paradigm for Object Detection
- Givens Coordinate Descent Methods for Rotation Matrix Learning in Trainable Embedding Indexes
- GLASS: GNN with Labeling Tricks for Subgraph Representation Learning
- Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games
- GNN is a Counter? Revisiting GNN for Question Answering
- GNN-LM: Language Modeling based on Global Contexts via GNN
- Goal-Directed Planning via Hindsight Experience Replay
- GPT-Critic: Offline Reinforcement Learning for End-to-End Task-Oriented Dialogue Systems
- Gradient Importance Learning for Incomplete Observations
- Gradient Information Matters in Policy Optimization by Back-propagating through Model
- Gradient Matching for Domain Generalization
- Gradient Step Denoiser for convergent Plug-and-Play
- GradMax: Growing Neural Networks using Gradient Information
- GradSign: Model Performance Inference with Theoretical Insights
- GRAND++: Graph Neural Diffusion with A Source Term
- Granger causal inference on DAGs identifies genomic loci regulating transcription
- Graph-Augmented Normalizing Flows for Anomaly Detection of Multiple Time Series
- Graph Auto-Encoder via Neighborhood Wasserstein Reconstruction
- Graph-based Nearest Neighbor Search in Hyperbolic Spaces
- Graph Condensation for Graph Neural Networks
- Graph-Enhanced Exploration for Goal-oriented Reinforcement Learning
- GraphENS: Neighbor-Aware Ego Network Synthesis for Class-Imbalanced Node Classification
- Graph-Guided Network for Irregularly Sampled Multivariate Time Series
- Graph-less Neural Networks: Teaching Old MLPs New Tricks Via Distillation
- Graph Neural Network Guided Local Search for the Traveling Salesperson Problem
- Graph Neural Networks with Learnable Structural and Positional Representations
- Graphon based Clustering and Testing of Networks: Algorithms and Theory
- Graph-Relational Domain Adaptation
- GreaseLM: Graph REASoning Enhanced Language Models
- GroundedML: Anchoring Machine Learning in Classical Algorithmic Theory
- Group-based Interleaved Pipeline Parallelism for Large-scale DNN Training
- Group equivariant neural posterior estimation
- Half-Inverse Gradients for Physical Deep Learning
- Handling Distribution Shifts on Graphs: An Invariance Perspective
- Heteroscedastic Temporal Variational Autoencoder For Irregularly Sampled Time Series
- Hidden Convexity of Wasserstein GANs: Interpretable Generative Models with Closed-Form Solutions
- Hidden Parameter Recurrent State Space Models For Changing Dynamics Scenarios
- Hierarchical Few-Shot Imitation with Skill Transition Models
- Hierarchical Variational Memory for Few-shot Learning Across Domains
- High Probability Bounds for a Class of Nonconvex Algorithms with AdaGrad Stepsize
- High Probability Generalization Bounds with Fast Rates for Minimax Problems
- Hindsight Foresight Relabeling for Meta-Reinforcement Learning
- Hindsight is 20/20: Leveraging Past Traversals to Aid 3D Perception
- Hindsight: Posterior-guided training of retrievers for improved open-ended generation
- Hot-Refresh Model Upgrades with Regression-Free Compatible Training in Image Retrieval
- How Attentive are Graph Attention Networks?
- How Did the Model Change? Efficiently Assessing Machine Learning API Shifts
- How Does SimSiam Avoid Collapse Without Negative Samples? A Unified Understanding with Self-supervised Contrastive Learning
- How Do Vision Transformers Work?
- How Low Can We Go: Trading Memory for Error in Low-Precision Training
- How many degrees of freedom do we need to train deep networks: a loss landscape perspective
- How Much Can CLIP Benefit Vision-and-Language Tasks?
- How to deal with missing data in supervised deep learning?
- How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data
- How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective
- How to Train Your MAML to Excel in Few-Shot Classification
- How unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis
- How Well Does Self-Supervised Pre-Training Perform with Streaming Data?
- HTLM: Hyper-Text Pre-Training and Prompting of Language Models
- Huber Additive Models for Non-stationary Time Series Analysis
- HyAR: Addressing Discrete-Continuous Action Reinforcement Learning via Hybrid Action Representation
- Hybrid Local SGD for Federated Learning with Heterogeneous Communications
- Hybrid Memoised Wake-Sleep: Approximate Inference at the Discrete-Continuous Interface
- Hybrid Random Features
- HyperDQN: A Randomized Exploration Method for Deep Reinforcement Learning
- Hyperparameter Tuning with Renyi Differential Privacy
- iFlood: A Stable and Effective Regularizer
- IFR-Explore: Learning Inter-object Functional Relationships in 3D Indoor Scenes
- Igeood: An Information Geometry Approach to Out-of-Distribution Detection
- IGLU: Efficient GCN Training via Lazy Updates
- Illiterate DALL-E Learns to Compose
- iLQR-VAE: control-based learning of input-driven dynamics with applications to neural data
- Image BERT Pre-training with Online Tokenizer
- Imbedding Deep Neural Networks
- Imitation Learning by Reinforcement Learning
- Imitation Learning from Observations under Transition Model Disparity
- Implicit Bias of Adversarial Training for Deep Neural Networks
- Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks
- Implicit Bias of Projected Subgradient Method Gives Provable Robust Recovery of Subspaces of Unknown Codimension
- Improved deterministic l2 robustness on CIFAR-10 and CIFAR-100
- Improving Federated Learning Face Recognition via Privacy-Agnostic Clusters
- Improving Mutual Information Estimation with Annealed and Energy-Based Bounds
- Improving Non-Autoregressive Translation Models Without Distillation
- Improving the Accuracy of Learning Example Weights for Imbalance Classification
- In a Nutshell, the Human Asked for This: Latent Goals for Following Temporal Specifications
- Increasing the Cost of Model Extraction with Calibrated Proof of Work
- Incremental False Negative Detection for Contrastive Learning
- Independent SE(3)-Equivariant Models for End-to-End Rigid Protein Docking
- Inductive Relation Prediction Using Analogy Subgraph Embeddings
- InfinityGAN: Towards Infinite-Pixel Image Synthesis
- Information Bottleneck: Exact Analysis of (Quantized) Neural Networks
- Information Gain Propagation: a New Way to Graph Active Learning with Soft Labels
- Information Prioritization through Empowerment in Visual Model-based RL
- Information-theoretic Online Memory Selection for Continual Learning
- Interacting Contour Stochastic Gradient Langevin Dynamics
- Interpretable Unsupervised Diversity Denoising and Artefact Removal
- IntSGD: Adaptive Floatless Compression of Stochastic Gradients
- Invariant Causal Representation Learning for Out-of-Distribution Generalization
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies
- Is Fairness Only Metric Deep? Evaluating and Addressing Subgroup Gaps in Deep Metric Learning
- Is High Variance Unavoidable in RL? A Case Study in Continuous Control
- Is Homophily a Necessity for Graph Neural Networks?
- Is Importance Weighting Incompatible with Interpolating Classifiers?
- Iterated Reasoning with Mutual Information in Cooperative and Byzantine Decentralized Teaming
- Iterative Refinement Graph Neural Network for Antibody Sequence-Structure Co-design
- It Takes Four to Tango: Multiagent Self Play for Automatic Curriculum Generation
- It Takes Two to Tango: Mixup for Deep Metric Learning
- Joint Shapley values: a measure of joint feature importance
- KL Guided Domain Adaptation
- Knowledge Infused Decoding
- Knowledge Removal in Sampling-based Bayesian Inference
- Know Thyself: Transferable Visual Control Policies Through Robot-Awareness
- Know Your Action Set: Learning Action Relations for Reinforcement Learning
- L0-Sparse Canonical Correlation Analysis
- Label-Efficient Semantic Segmentation with Diffusion Models
- Label Encoding for Regression Networks
- Label Leakage and Protection in Two-party Split Learning
- Language-biased image classification: evaluation based on semantic representations
- Language-driven Semantic Segmentation
- Language model compression with weighted low-rank factorization
- Language modeling via stochastic processes
- Large Language Models Can Be Strong Differentially Private Learners
- Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect
- Large-Scale Representation Learning on Graphs via Bootstrapping
- Latent Image Animator: Learning to Animate Images via Latent Space Navigation
- Latent Variable Sequential Set Transformers for Joint Multi-Agent Motion Prediction
- Learnability Lock: Authorized Learnability Control Through Adversarial Invertible Transformations
- Learnability of convolutional neural networks for infinite dimensional input via mixed and anisotropic smoothness
- Learned Simulators for Turbulence
- Learning 3D Representations of Molecular Chirality with Invariance to Bond Rotations
- Learning Altruistic Behaviours in Reinforcement Learning without External Rewards
- Learning a subspace of policies for online adaptation in Reinforcement Learning
- Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction
- Learning-Augmented $k$-means Clustering
- Learning by Directional Gradient Descent
- Learning Causal Models from Conditional Moment Restrictions by Importance Weighting
- Learning Continuous Environment Fields via Implicit Functions
- Learning curves for continual learning in neural networks: Self-knowledge transfer and forgetting
- Learning Curves for Gaussian Process Regression with Power-Law Priors and Targets
- Learning Curves for SGD on Structured Features
- Learning Discrete Structured Variational Auto-Encoder using Natural Evolution Strategies
- Learning Disentangled Representation by Exploiting Pretrained Generative Models: A Contrastive Learning View
- Learning Distributionally Robust Models at Scale via Composite Optimization
- Learning Efficient Image Super-Resolution Networks via Structure-Regularized Pruning
- Learning Efficient Online 3D Bin Packing on Packing Configuration Trees
- Learning Fast, Learning Slow: A General Continual Learning Method based on Complementary Learning System
- Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality
- Learning Features with Parameter-Free Layers
- Learning Generalizable Representations for Reinforcement Learning via Adaptive Meta-learner of Behavioral Similarities
- Learning Graphon Mean Field Games and Approximate Nash Equilibria
- Learning Guarantees for Graph Convolutional Networks on the Stochastic Block Model
- Learning Hierarchical Structures with Differentiable Nondeterministic Stacks
- Learning Long-Term Reward Redistribution via Randomized Return Decomposition
- Learning meta-features for AutoML
- Learning more skills through optimistic exploration
- Learning Multimodal VAEs through Mutual Supervision
- Learning Neural Contextual Bandits through Perturbed Rewards
- Learning Object-Oriented Dynamics for Planning from Text
- Learning Optimal Conformal Classifiers
- Learning Prototype-oriented Set Representations for Meta-Learning
- Learning Pruning-Friendly Networks via Frank-Wolfe: One-Shot, Any-Sparsity, And No Retraining
- Learning Representation from Neural Fisher Kernel with Low-rank Approximation
- Learning Scenario Representation for Solving Two-stage Stochastic Integer Programs
- Learning State Representations via Retracing in Reinforcement Learning
- Learning Strides in Convolutional Neural Networks
- Learning Super-Features for Image Retrieval
- Learning Synthetic Environments and Reward Networks for Reinforcement Learning
- Learning Temporally Causal Latent Processes from General Temporal Data
- Learning the Dynamics of Physical Systems from Sparse Observations with Finite Element Networks
- Learning to Annotate Part Segmentation with Gradient Matching
- Learning to Complete Code with Sketches
- Learning to Dequantise with Truncated Flows
- Learning to Downsample for Segmentation of Ultra-High Resolution Images
- Learning to Extend Molecular Scaffolds with Structural Motifs
- Learning to Generalize across Domains on Single Test Samples
- Learning to Guide and to be Guided in the Architect-Builder Problem
- Learning to Map for Active Semantic Goal Navigation
- Learning to Remember Patterns: Pattern Matching Memory Networks for Traffic Forecasting
- Learning to Schedule Learning rate with Graph Neural Networks
- Learning Towards The Largest Margins
- Learning transferable motor skills with hierarchical latent mixture policies
- Learning Transferable Reward for Query Object Localization with Policy Adaptation
- Learning Value Functions from Undirected State-only Experience
- Learning Versatile Neural Architectures by Propagating Network Codes
- Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers
- Learning Weakly-supervised Contrastive Representations
- Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations
- Learn Locally, Correct Globally: A Distributed Algorithm for Training Graph Neural Networks
- Leveraging AI for Science
- Leveraging Automated Unit Tests for Unsupervised Code Translation
- Leveraging unlabeled data to predict out-of-distribution performance
- LFPT5: A Unified Framework for Lifelong Few-shot Language Learning Based on Prompt Tuning of T5
- LIGS: Learnable Intrinsic-Reward Generation Selection for Multi-Agent Learning
- Likelihood Training of Schrödinger Bridge using Forward-Backward SDEs Theory
- Linking Emergent and Natural Languages via Corpus Transfer
- Lipschitz-constrained Unsupervised Skill Discovery
- Local Feature Swapping for Generalization in Reinforcement Learning
- Long Expressive Memory for Sequence Modeling
- Looking Back on Learned Experiences For Class/task Incremental Learning
- LoRA: Low-Rank Adaptation of Large Language Models
- LORD: Lower-Dimensional Embedding of Log-Signature in Neural Rough Differential Equations
- Lossless Compression with Probabilistic Circuits
- Lossy Compression with Distribution Shift as Entropy Constrained Optimal Transport
- Low-Budget Active Learning via Wasserstein Distance: An Integer Programming Approach
- Machine Learning for Drug Discovery (MLDD)
- Machine Learning For Elliptic PDEs: Fast Rate Generalization Bound, Neural Scaling Law and Minimax Optimality
- MaGNET: Uniform Sampling from Deep Generative Network Manifolds Without Retraining
- MAML is a Noisy Contrastive Learner in Classification
- Map Induction: Compositional spatial submap learning for efficient exploration in novel environments
- Mapping conditional distributions for domain adaptation under generalized target shift
- Mapping Language Models to Grounded Conceptual Spaces
- Mastering Visual Continuous Control: Improved Data-Augmented Reinforcement Learning
- Maximizing Ensemble Diversity in Deep Reinforcement Learning
- Maximum Entropy RL (Provably) Solves Some Robust RL Problems
- Maximum n-times Coverage for Vaccine Design
- MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC
- Measuring CLEVRness: Black-box Testing of Visual Reasoning Models
- Measuring the Interpretability of Unsupervised Representations via Quantized Reversed Probing
- Memorizing Transformers
- Memory Augmented Optimizers for Deep Learning
- Memory Replay with Data Compression for Continual Learning
- Mention Memory: incorporating textual knowledge into Transformers through entity mention attention
- Message Passing Neural PDE Solvers
- Meta Discovery: Learning to Discover Novel Classes given Very Limited Data
- Meta-Imitation Learning by Watching Video Demonstrations
- Meta Learning Low Rank Covariance Factors for Energy Based Deterministic Uncertainty
- Meta-Learning with Fewer Tasks through Task Interpolation
- MetaMorph: Learning Universal Controllers with Transformers
- MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts
- MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling
- Mind the Gap: Domain Gap Control for Single Shot Domain Adaptation for Generative Adversarial Networks
- Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond
- miniF2F: a cross-system benchmark for formal Olympiad-level mathematics
- Minimax Optimality (Probably) Doesn't Imply Distribution Learning for GANs
- Minimax Optimization with Smooth Algorithmic Adversaries
- Mirror Descent Policy Optimization
- Missingness Bias in Model Debugging
- MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer
- Model Agnostic Interpretability for Multiple Instance Learning
- Model-augmented Prioritized Experience Replay
- Model-Based Offline Meta-Reinforcement Learning with Regularization
- Modeling Label Space Interactions in Multi-label Classification using Box Embeddings
- Model Zoo: A Growing Brain That Learns Continually
- Modular Lifelong Reinforcement Learning via Neural Composition
- MonoDistill: Learning Spatial Features for Monocular 3D Object Detection
- Monotonic Differentiable Sorting Networks
- MoReL: Multi-omics Relational Learning
- MT3: Multi-Task Multitrack Music Transcription
- Multi-Agent MDP Homomorphic Networks
- Multi-Critic Actor Learning: Teaching RL Policies to Act with Style
- Multimeasurement Generative Models
- Multi-Mode Deep Matrix and Tensor Factorization
- Multi-objective Optimization by Learning Space Partition
- Multiset-Equivariant Set Prediction with Approximate Implicit Differentiation
- Multi-Stage Episodic Control for Strategic Exploration in Text Games
- Multi-Task Processes
- Multitask Prompted Training Enables Zero-Shot Task Generalization
- NAS-Bench-Suite: NAS Evaluation is (Now) Surprisingly Easy
- NASI: Label- and Data-agnostic Neural Architecture Search at Initialization
- NASPY: Automated Extraction of Automated Machine Learning Models
- NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training
- Natural Language Descriptions of Deep Features
- Natural Posterior Network: Deep Bayesian Predictive Uncertainty for Exponential Family Distributions
- Near-optimal Offline Reinforcement Learning with Linear Representation: Leveraging Variance Information with Pessimism
- Near-Optimal Reward-Free Exploration for Linear Mixture MDPs with Plug-in Solver
- Network Augmentation for Tiny Deep Learning
- Network Insensitivity to Parameter Noise via Parameter Attack During Training
- NeuPL: Neural Population Learning
- Neural Collapse Under MSE Loss: Proximity to and Dynamics on the Central Path
- Neural Contextual Bandits with Deep Representation and Shallow Exploration
- Neural Deep Equilibrium Solvers
- Neural graphical modelling in continuous-time: consistency guarantees and algorithms
- Neural Link Prediction with Walk Pooling
- Neural Markov Controlled SDE: Stochastic Optimization for Continuous-Time Data
- Neural Methods for Logical Reasoning over Knowledge Graphs
- Neural Models for Output-Space Invariance in Combinatorial Problems
- Neural Network Approximation based on Hausdorff distance of Tropical Zonotopes
- Neural Networks as Kernel Learners: The Silent Alignment Effect
- Neural Parameter Allocation Search
- Neural Processes with Stochastic Attention: Paying more attention to the context dataset
- Neural Program Synthesis with Query
- Neural Relational Inference with Node-Specific Information
- Neural Solvers for Fast and Accurate Numerical Optimal Control
- Neural Spectral Marked Point Processes
- Neural Stochastic Dual Dynamic Programming
- Neural Structured Prediction for Inductive Node Classification
- Neural Variational Dropout Processes
- New Insights on Reducing Abrupt Representation Change in Online Continual Learning
- Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction
- NODE-GAM: Neural Generalized Additive Model for Interpretable Deep Learning
- NodePiece: Compositional and Parameter-Efficient Representations of Large Knowledge Graphs
- Noisy Feature Mixup
- Nonlinear ICA Using Volume-Preserving Transformations
- Non-Linear Operator Approximations for Initial Value Problems
- Non-Parallel Text Style Transfer with Self-Parallel Supervision
- Non-Transferable Learning: A New Approach for Model Ownership Verification and Applicability Authorization
- No One Representation to Rule Them All: Overlapping Features of Training Methods
- No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models
- Normalization of Language Embeddings for Cross-Lingual Alignment
- Object Dynamics Distillation for Scene Decomposition and Representation
- Object Pursuit: Building a Space of Objects via Discriminative Weight Generation
- Objects in Semantic Topology
- Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
- Offline Reinforcement Learning with Implicit Q-Learning
- Offline Reinforcement Learning with Value-based Episodic Memory
- Omni-Dimensional Dynamic Convolution
- Omni-Scale CNNs: a simple and effective kernel size configuration for time series classification
- On Bridging Generic and Personalized Federated Learning for Image Classification
- On Covariate Shift of Latent Confounders in Imitation and Reinforcement Learning
- On Distributed Adaptive Optimization with Gradient Compression
- One After Another: Learning Incremental Skills for a Changing World
- On Evaluation Metrics for Graph Generative Models
- On feature learning in neural networks with global convergence guarantees
- On Improving Adversarial Transferability of Vision Transformers
- On Incorporating Inductive Biases into VAEs
- Online Ad Hoc Teamwork under Partial Observability
- Online Adversarial Attacks
- Online Continual Learning on Class Incremental Blurry Task Configuration with Anytime Inference
- Online Coreset Selection for Rehearsal-based Continual Learning
- Online Facility Location with Predictions
- Online Hyperparameter Meta-Learning with Hypergradient Distillation
- Online Target Q-learning with Reverse Experience Replay: Efficiently finding the Optimal Policy for Linear MDPs
- On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning
- On Non-Random Missing Labels in Semi-Supervised Learning
- On-Policy Model Errors in Reinforcement Learning
- On Predicting Generalization using GANs
- On Redundancy and Diversity in Cell-based Neural Architecture Search
- On Robust Prefix-Tuning for Text Classification
- On the approximation properties of recurrent encoder-decoder architectures
- On the benefits of maximum likelihood estimation for Regression and Forecasting
- On the Certified Robustness for Ensemble Models and Beyond
- On the Connection between Local Attention and Dynamic Depth-wise Convolution
- On the Convergence of Certified Robust Training with Interval Bound Propagation
- On the Convergence of mSGD and AdaGrad for Stochastic Optimization
- On the Convergence of the Monte Carlo Exploring Starts Algorithm for Reinforcement Learning
- On the Existence of Universal Lottery Tickets
- On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications
- On the Importance of Difficulty Calibration in Membership Inference Attacks
- On the Importance of Firth Bias Reduction in Few-Shot Classification
- On the Learning and Learnability of Quasimetrics
- On the Limitations of Multimodal VAEs
- On the Optimal Memorization Power of ReLU Neural Networks
- On the Pitfalls of Analyzing Individual Neurons in Language Models
- On the Pitfalls of Heteroscedastic Uncertainty Estimation with Probabilistic Neural Networks
- On the relation between statistical learning and perceptual distances
- On the Role of Neural Collapse in Transfer Learning
- On the role of population heterogeneity in emergent communication
- On the Uncomputability of Partition Functions in Energy-Based Sequence Models
- OntoProtein: Protein Pretraining With Gene Ontology Embedding
- Open-Set Recognition: A Good Closed-Set Classifier is All You Need
- Open-vocabulary Object Detection via Vision and Language Knowledge Distillation
- Open-World Semi-Supervised Learning
- Optimal ANN-SNN Conversion for High-accuracy and Ultra-low-latency Spiking Neural Networks
- Optimal Representations for Covariate Shift
- Optimal Transport for Causal Discovery
- Optimal Transport for Long-Tailed Recognition with Learnable Cost Matrix
- Optimization and Adaptive Generalization of Three layer Neural Networks
- Optimization inspired Multi-Branch Equilibrium Models
- Optimizer Amalgamation
- Optimizing Neural Networks with Gradient Lexicase Selection
- Orchestrated Value Mapping for Reinforcement Learning
- Out-of-distribution Generalization in the Presence of Nuisance-Induced Spurious Correlations
- Overcoming The Spectral Bias of Neural Value Approximation
- PAC-Bayes Information Bottleneck
- PAC Prediction Sets Under Covariate Shift
- P-Adapters: Robustly Extracting Factual Information from Language Models with Diverse Prompts
- PAIR^2Struct: Privacy, Accountability, Interpretability, Robustness, Reasoning on Structured Data
- Parallel Training of GRU Networks with a Multi-Grid Solver for Long Sequences
- Pareto Policy Adaptation
- Pareto Policy Pool for Model-based Offline Reinforcement Learning
- Pareto Set Learning for Neural Multi-Objective Combinatorial Optimization
- Partial Wasserstein Adversarial Network for Non-rigid Point Set Registration
- Particle Stochastic Dual Coordinate Ascent: Exponential convergent algorithm for mean field neural network optimization
- Patch-Fool: Are Vision Transformers Always Robust Against Adversarial Perturbations?
- Path Auxiliary Proposal for MCMC in Discrete Space
- Path Integral Sampler: A Stochastic Control Approach For Sampling
- PEARL: Data Synthesis via Private Embeddings and Adversarial Reconstruction Learning
- Peek-a-Boo: What (More) is Disguised in a Randomly Weighted Neural Network, and How to Find It Efficiently
- Perceiver IO: A General Architecture for Structured Inputs & Outputs
- PER-ETD: A Polynomially Efficient Emphatic Temporal Difference Learning Method
- Permutation-Based SGD: Is Random Optimal?
- Permutation Compressors for Provably Faster Distributed Nonconvex Optimization
- Pessimistic Bootstrapping for Uncertainty-Driven Offline Reinforcement Learning
- Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage
- Petascale connectomics and beyond
- PF-GNN: Differentiable particle filtering based approximation of universal graph representations
- Phase Collapse in Neural Networks
- Phenomenology of Double Descent in Finite-Width Neural Networks
- PI3NN: Out-of-distribution-aware Prediction Intervals from Three Neural Networks
- PiCO: Contrastive Label Disambiguation for Partial Label Learning
- PipeGCN: Efficient Full-Graph Training of Graph Convolutional Networks with Pipelined Feature Communication
- Pix2seq: A Language Modeling Framework for Object Detection
- Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
- Planning in Stochastic Environments with a Learned Model
- Plant 'n' Seek: Can You Find the Winning Ticket?
- POETREE: Interpretable Policy Learning with Adaptive Decision Trees
- Poisoning and Backdooring Contrastive Learning
- Policy Gradients Incorporating the Future
- Policy improvement by planning with Gumbel
- Policy Smoothing for Provably Robust Reinforcement Learning
- PolyLoss: A Polynomial Expansion Perspective of Classification Loss Functions
- PoNet: Pooling Network for Efficient Token Mixing in Long Sequences
- Possibility Before Utility: Learning And Using Hierarchical Affordances
- Post hoc Explanations may be Ineffective for Detecting Unknown Spurious Correlation
- Post-Training Detection of Backdoor Attacks for Two-Class and Multi-Attack Scenarios
- Practical Conditional Neural Process Via Tractable Dependent Predictions
- Practical Integration via Separable Bijective Networks
- Predicting Physics in Mesh-reduced Space with Temporal Attention
- Pretrained Language Model in Continual Learning: A Comparative Study
- Pre-training Molecular Graph Representation with 3D Geometry
- Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators
- PriorGrad: Improving Conditional Denoising Diffusion Models with Data-Dependent Adaptive Prior
- Privacy Implications of Shuffling
- Probabilistic Implicit Scene Completion
- Procedural generalization by planning with self-supervised world models
- Programmatic Reinforcement Learning without Oracles
- Progressive Distillation for Fast Sampling of Diffusion Models
- Promoting Saliency From Depth: Deep Unsupervised RGB-D Saliency Detection
- Proof Artifact Co-Training for Theorem Proving with Language Models
- Properties from mechanisms: an equivariance perspective on identifiable representation learning
- Prospect Pruning: Finding Trainable Weights at Initialization using Meta-Gradients
- ProtoRes: Proto-Residual Network for Pose Authoring via Learned Inverse Kinematics
- Prototype memory and attention mechanisms for few shot image generation
- Prototypical Contrastive Predictive Coding
- Provable Adaptation across Multiway Domains via Representation Learning
- Provable Learning-based Algorithm For Sparse Recovery
- Provably convergent quasistatic dynamics for mean-field two-player zero-sum games
- Provably Filtering Exogenous Distractors using Multistep Inverse Dynamics
- Provably Robust Adversarial Examples
- Proving the Lottery Ticket Hypothesis for Convolutional Neural Networks
- PSA-GAN: Progressive Self Attention GANs for Synthetic Time Series
- Pseudo-Labeled Auto-Curriculum Learning for Semi-Supervised Keypoint Localization
- Pseudo Numerical Methods for Diffusion Models on Manifolds
- Pyraformer: Low-Complexity Pyramidal Attention for Long-Range Time Series Modeling and Forecasting
- QDrop: Randomly Dropping Quantization for Extremely Low-bit Post-Training Quantization
- Quadtree Attention for Vision Transformers
- Quantitative Performance Assessment of CNN Units via Topological Entropy Calculation
- Query Efficient Decision Based Sparse Attacks Against Black-Box Deep Learning Models
- Query Embedding on Hyper-Relational Knowledge Graphs
- R4D: Utilizing Reference Objects for Long-Range Distance Estimation
- R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning
- Random matrices in service of ML footprint: ternary random features with no performance loss
- Real-Time Neural Voice Camouflage
- Recursive Disentanglement Network
- Recycling Model Updates in Federated Learning: Are Gradient Subspaces Low-Rank?
- Reducing Excessive Margin to Achieve a Better Accuracy vs. Robustness Trade-off
- RegionViT: Regional-to-Local Attention for Vision Transformers
- Regularized Autoencoders for Isometric Representation Learning
- Reinforcement Learning in Presence of Discrete Markovian Context Evolution
- Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory
- Reinforcement Learning with Sparse Rewards using Guidance from Offline Demonstration
- Relating transformers to models and neural representations of the hippocampal formation
- Relational Learning with Variational Bayes
- Relational Multi-Task Learning: Modeling Relations between Data and Tasks
- Relational Surrogate Loss Learning
- RelaxLoss: Defending Membership Inference Attacks without Losing Utility
- Reliable Adversarial Distillation with Unreliable Teachers
- RelViT: Concept-guided Vision Transformer for Visual Relational Reasoning
- Representation-Agnostic Shape Fields
- Representational Continuity for Unsupervised Continual Learning
- Representation Learning for Online and Offline RL in Low-rank MDPs
- Representation Learning in the Global South: Societal Considerations-Fairness, Safety and Privacy
- Representing Mixtures of Word Embeddings with Mixtures of Topic Embeddings
- Resolving Training Biases via Influence-based Data Relabeling
- Resonance in Weight Space: Covariate Shift Can Drive Divergence of SGD with Momentum
- Responsible Disclosure of Generative Models Using Scalable Fingerprinting
- Rethinking Adversarial Transferability from a Data Distribution Perspective
- Rethinking Class-Prior Estimation for Positive-Unlabeled Learning
- Rethinking Goal-Conditioned Supervised Learning and Its Connection to Offline RL
- Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework
- Rethinking Supervised Pre-Training for Better Downstream Transferring
- Retriever: Learning Content-Style Representation as a Token-Level Bipartite Graph
- Reverse Engineering of Imperceptible Adversarial Image Perturbations
- Reversible Instance Normalization for Accurate Time-Series Forecasting against Distribution Shift
- Revisiting Design Choices in Offline Model Based Reinforcement Learning
- Revisiting flow generative models for Out-of-distribution detection
- Revisiting Over-smoothing in BERT from the Perspective of Graph
- Revisit Kernel Pruning with Lottery Regulated Grouped Convolutions
- Reward Uncertainty for Exploration in Preference-based Reinforcement Learning
- RISP: Rendering-Invariant State Predictor with Differentiable Simulation and Rendering for Cross-Domain Parameter Estimation
- Robbing the Fed: Directly Obtaining Private Data in Federated Learning with Modified Models
- Robust and Scalable SDE Learning: A Functional Perspective
- Robust Learning Meets Generative Models: Can Proxy Distributions Improve Adversarial Robustness?
- Robust Unlearnable Examples: Protecting Data Privacy Against Adversarial Learning
- RotoGrad: Gradient Homogenization in Multitask Learning
- RvS: What is Essential for Offline RL via Supervised Learning?
- Safe Neurosymbolic Learning with Differentiable Symbolic Execution
- Salient ImageNet: How to discover spurious features in Deep Learning?
- Sample and Computation Redistribution for Efficient Face Detection
- Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation
- Sample Efficient Stochastic Policy Extragradient Algorithm for Zero-Sum Markov Game
- Sample Selection with Uncertainty of Losses for Learning with Noisy Labels
- Sampling with Mirrored Stein Operators
- Scalable One-Pass Optimisation of High-Dimensional Weight-Update Hyperparameters by Implicit Differentiation
- Scalable Sampling for Nonsymmetric Determinantal Point Processes
- Scale Efficiently: Insights from Pretraining and Finetuning Transformers
- Scale Mixtures of Neural Network Gaussian Processes
- Scaling Laws for Neural Machine Translation
- Scarf: Self-Supervised Contrastive Learning using Random Feature Corruption
- Scattering Networks on the Sphere for Scalable and Rotationally Equivariant Spherical CNNs
- Scene Transformer: A unified architecture for predicting future trajectories of multiple agents
- Score-Based Generative Modeling with Critically-Damped Langevin Diffusion
- SDEdit: Guided Image Synthesis and Editing with Stochastic Differential Equations
- Selective Ensembles for Consistent Predictions
- Self-ensemble Adversarial Training for Improved Robustness
- Self-Joint Supervised Learning
- Self-Supervised Graph Neural Networks for Improved Electroencephalographic Seizure Analysis
- Self-Supervised Inference in State-Space Models
- Self-supervised Learning is More Robust to Dataset Imbalance
- Self-Supervision Enhanced Feature Selection with Correlated Gates
- Semi-relaxed Gromov-Wasserstein divergence and applications on graphs
- Sequence Approximation using Feedforward Spiking Neural Network for Spatiotemporal Learning: Theory and Optimization Methods
- Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning
- Setting up ML Evaluation Standards to Accelerate Progress
- SGD Can Converge to Local Maxima
- Shallow and Deep Networks are Near-Optimal Approximators of Korobov Functions
- SHINE: SHaring the INverse Estimate from the forward pass for bi-level optimization and implicit models
- Should I Run Offline Reinforcement Learning or Behavioral Cloning?
- Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative
- Shuffle Private Stochastic Convex Optimization
- Signing the Supermask: Keep, Hide, Invert
- Simple GNN Regularisation for 3D Molecular Property Prediction and Beyond
- SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
- SketchODE: Learning neural sketch representation in continuous time
- Skill-based Meta-Reinforcement Learning
- Socially Responsible Machine Learning
- Solving Inverse Problems in Medical Imaging with Score-Based Generative Models
- SOSP: Efficiently Capturing Global Correlations by Second-Order Structured Pruning
- Sound Adversarial Audio-Visual Navigation
- Sound and Complete Neural Network Repair with Minimality and Locality Guarantees
- Source-Free Adaptation to Measurement Shift via Bottom-Up Feature Restoration
- Space-Time Graph Neural Networks
- Spanning Tree-based Graph Generation for Molecules
- Sparse Attention with Learning to Hash
- Sparse Communication via Mixed Distributions
- Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity
- Sparsity Winning Twice: Better Robust Generalization from More Efficient Training
- Spatial Graph Attention and Curiosity-driven Policy for Antiviral Drug Discovery
- SphereFace2: Binary Classification is All You Need for Deep Face Recognition
- Spherical Message Passing for 3D Molecular Graphs
- Spike-inspired rank coding for fast and accurate recurrent neural networks
- SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training
- Spread Spurious Attribute: Improving Worst-group Accuracy with Spurious Attribute Estimation
- Sqrt(d) Dimension Dependence of Langevin Monte Carlo
- SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation
- Stability Regularization for Discrete Representation Learning
- Steerable Partial Differential Operators for Equivariant Neural Networks
- Stein Latent Optimization for Generative Adversarial Networks
- Step-unrolled Denoising Autoencoders for Text Generation
- Stiffness-aware neural network for learning Hamiltonian systems
- Stochastic Training is Not Necessary for Generalization
- Strength of Minibatch Noise in SGD
- Structure-Aware Transformer Policy for Inhomogeneous Multi-Task Reinforcement Learning
- StyleAlign: Analysis and Applications of Aligned StyleGAN Models
- StyleNeRF: A Style-based 3D Aware Generator for High-resolution Image Synthesis
- Subspace Regularizers for Few-Shot Class Incremental Learning
- SUMNAS: Supernet with Unbiased Meta-Features for Neural Architecture Search
- Superclass-Conditional Gaussian Mixture Model For Learning Fine-Grained Embeddings
- Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
- SURF: Semi-supervised Reward Learning with Data Augmentation for Feedback-efficient Preference-based Reinforcement Learning
- Surreal-GAN: Semi-Supervised Representation Learning via GAN for uncovering heterogeneous disease-related imaging patterns
- Surrogate Gap Minimization Improves Sharpness-Aware Training
- Surrogate NAS Benchmarks: Going Beyond the Limited Search Spaces of Tabular NAS Benchmarks
- switch-GLAT: Multilingual Parallel Machine Translation Via Code-Switch Decoder
- Switch to Generalize: Domain-Switch Learning for Cross-Domain Few-Shot Classification
- Symbolic Learning to Optimize: Towards Interpretability and Scalability
- Synchromesh: Reliable Code Generation from Pre-trained Language Models
- Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
- TAda! Temporally-Adaptive Convolutions for Video Understanding
- Taming Sparsely Activated Transformer with Stochastic Experts
- TAMP-S2GCNets: Coupling Time-Aware Multipersistence Knowledge Representation with Spatio-Supra Graph Convolutional Networks for Time-Series Forecasting
- TAPEX: Table Pre-training via Learning a Neural SQL Executor
- Target-Side Input Augmentation for Sequence to Sequence Generation
- Task Affinity with Maximum Bipartite Matching in Few-Shot Learning
- Task-Induced Representation Learning
- Task Relatedness-Based Generalization Bounds for Meta Learning
- Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification
- Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting
- The Boltzmann Policy Distribution: Accounting for Systematic Suboptimality in Human Models
- The Close Relationship Between Contrastive Learning and Meta-Learning
- The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program
- The Effects of Invertibility on the Representational Complexity of Encoders in Variational Autoencoders
- The Effects of Reward Misspecification: Mapping and Mitigating Misaligned Models
- The Efficiency Misnomer
- The Evolution of Uncertainty of Learning in Games
- The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs
- The Hidden Convex Optimization Landscape of Regularized Two-Layer ReLU Networks: an Exact Characterization of Optimal Solutions
- The Inductive Bias of In-Context Learning: Rethinking Pretraining Example Design
- The Information Geometry of Unsupervised Reinforcement Learning
- The MultiBERTs: BERT Reproductions for Robustness Analysis
- The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization
- The Rich Get Richer: Disparate Impact of Semi-Supervised Learning
- The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks
- The Role of Pretrained Representations for the OOD Generalization of RL Agents
- The Spectral Bias of Polynomial Neural Networks
- The Three Stages of Learning Dynamics in High-dimensional Kernel Methods
- The Uncanny Similarity of Recurrence and Depth
- The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training
- THOMAS: Trajectory Heatmap Output with learned Multi-Agent Sampling
- Tighter Sparse Approximation Bounds for ReLU Neural Networks
- ToM2C: Target-oriented Multi-agent Communication and Cooperation with Theory of Mind
- Top-label calibration and multiclass-to-binary reductions
- Top-N: Equivariant Set and Graph Generation without Exchangeability
- Topological Experience Replay
- Topological Graph Neural Networks
- Topologically Regularized Data Embeddings
- Toward Efficient Low-Precision Training: Data Format Optimization and Hysteresis Quantization
- Toward Faithful Case-based Reasoning through Learning Prototypes in a Nearest Neighbor-friendly Space
- Towards a Unified View of Parameter-Efficient Transfer Learning
- Towards Better Understanding and Better Generalization of Low-shot Classification in Histology Images with Contrastive Learning
- Towards Building A Group-based Unsupervised Representation Disentanglement Framework
- Towards Continual Knowledge Learning of Language Models
- Towards Deepening Graph Neural Networks: A GNTK-based Optimization Perspective
- Towards Deployment-Efficient Reinforcement Learning: Lower Bound and Optimality
- Towards Empirical Sandwich Bounds on the Rate-Distortion Function
- Towards Evaluating the Robustness of Neural Networks Learned by Transduction
- Towards General Function Approximation in Zero-Sum Markov Games
- Towards Model Agnostic Federated Learning Using Knowledge Distillation
- Towards Training Billion Parameter Graph Neural Networks for Atomic Simulations
- Towards Understanding Generalization via Decomposing Excess Risk Dynamics
- Towards Understanding the Data Dependency of Mixup-style Training
- Towards Understanding the Robustness Against Evasion Attack on Categorical Data
- TPU-GAN: Learning temporal coherence from dynamic point cloud sequences
- Tracking the risk of a deployed model and detecting harmful distribution shifts
- TRAIL: Near-Optimal Imitation Learning with Suboptimal Data
- Training Data Generating Networks: Shape Reconstruction via Bi-level Optimization
- Training invariances and the low-rank phenomenon: beyond linear networks
- Training Structured Neural Networks Through Manifold Identification and Variance Reduction
- Training Transition Policies via Distribution Matching for Complex Tasks
- Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation
- Trans-Encoder: Unsupervised sentence-pair modelling through self- and mutual-distillations
- Transferable Adversarial Attack based on Integrated Gradients
- Transfer RL across Observation Feature Spaces via Model-Based Regularization
- Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design
- Transformer-based Transform Coding
- Transformer Embeddings of Irregularly Spaced Events and Their Participants
- Transformers Can Do Bayesian Inference
- Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models
- TRGP: Trust Region Gradient Projection for Continual Learning
- Triangle and Four Cycle Counting with Predictions in Graph Streams
- Trigger Hunting with a Topological Prior for Trojan Detection
- Trivial or Impossible --- dichotomous data difficulty masks model differences (on ImageNet and beyond)
- Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning
- Tuformer: Data-driven Design of Transformers for Improved Generalization or Efficiency
- T-WaveNet: A Tree-Structured Wavelet Neural Network for Time Series Signal Analysis
- Uncertainty Modeling for Out-of-Distribution Generalization
- Understanding and Improving Graph Injection Attack by Promoting Unnoticeability
- Understanding and Leveraging Overparameterization in Recursive Value Estimation
- Understanding and Preventing Capacity Loss in Reinforcement Learning
- Understanding approximate and unrolled dictionary learning for pattern recovery
- Understanding Dimensional Collapse in Contrastive Self-supervised Learning
- Understanding Domain Randomization for Sim-to-real Transfer
- Understanding Intrinsic Robustness Using Label Uncertainty
- Understanding Latent Correlation-Based Multiview Learning and Self-Supervision: An Identifiability Perspective
- Understanding over-squashing and bottlenecks on graphs via curvature
- Understanding the Role of Self Attention for Efficient Speech Recognition
- Understanding the Variance Collapse of SVGD in High Dimensions
- Unified Visual Transformer Compression
- UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning
- Unifying Likelihood-free Inference with Black-box Optimization and Beyond
- Universal Approximation Under Constraints is Possible with Transformers
- Universalizing Weak Supervision
- Unraveling Model-Agnostic Meta-Learning via The Adaptation Learning Rate
- Unrolling PALM for Sparse Semi-Blind Source Separation
- Unsupervised Discovery of Object Radiance Fields
- Unsupervised Disentanglement with Tensor Product Representations on the Torus
- Unsupervised Learning of Full-Waveform Inversion: Connecting CNN and Partial Differential Equation in a Loop
- Unsupervised Semantic Segmentation by Distilling Feature Correspondences
- Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling
- Using Graph Representation Learning with Schema Encoders to Measure the Severity of Depressive Symptoms
- VAE Approximation Error: ELBO and Exponential Families
- Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning
- Value Gradient weighted Model-Based Reinforcement Learning
- Variational autoencoders in the presence of low-dimensional data: landscape and implicit bias
- Variational Inference for Discriminative Learning with Generative Modeling of Feature Incompletion
- Variational methods for simulation-based inference
- Variational Neural Cellular Automata
- Variational oracle guiding for reinforcement learning
- Variational Predictive Routing with Nested Subjective Timescales
- VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating 3D ARTiculated Objects
- VC dimension of partially quantized neural networks in the overparametrized regime
- Vector-quantized Image Modeling with Improved VQGAN
- VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning
- ViDT: An Efficient and Effective Fully Transformer-based Object Detector
- Vision-Based Manipulators Need to Also See from Their Hands
- Visual Correspondence Hallucination
- Visual hyperacuity with moving sensor and recurrent neural computations
- Visual Representation Learning Does Not Generalize Strongly Within the Same Domain
- Visual Representation Learning over Latent Domains
- ViTGAN: Training GANs with Vision Transformers
- Vitruvion: A Generative Model of Parametric CAD Sketches
- VOS: Learning What You Don't Know by Virtual Outlier Synthesis
- W-CTC: a Connectionist Temporal Classification Loss with Wild Cards
- WeakM3D: Towards Weakly Supervised Monocular 3D Object Detection
- Weighted Training for Cross-Task Learning
- What Do We Mean by Generalization in Federated Learning?
- What Happens after SGD Reaches Zero Loss? --A Mathematical Framework
- What Makes Better Augmentation Strategies? Augment Difficult but Not too Different
- What’s Wrong with Deep Learning in Tree Search for Combinatorial Optimization
- When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently?
- When should agents explore?
- When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations
- When, Why, and Which Pretrained GANs Are Useful?
- Which Shortcut Cues Will DNNs Choose? A Study from the Parameter-Space Perspective
- Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL
- Who Is Your Right Mixup Partner in Positive and Unlabeled Learning
- Why Propagate Alone? Parallel Use of Labels and Features on Graphs
- Wiki-M3L: Wikipedia and Multimodal & Multilingual Research
- Wiring Up Vision: Minimizing Supervised Synaptic Updates Needed to Produce a Primate Ventral Stream
- Wisdom of Committees: An Overlooked Approach To Faster and More Accurate Models
- Wish you were here: Hindsight Goal Selection for long-horizon dexterous manipulation
- Workshop on Agent Learning in Open-Endedness
- Workshop on the Elements of Reasoning: Objects, Structure and Causality
- X-model: Improving Data Efficiency in Deep Learning with A Minimax Model
- You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks
- You Mostly Walk Alone: Analyzing Feature Attribution in Trajectory Prediction
- Zero-CL: Instance and Feature decorrelation for negative-free symmetric contrastive learning
- ZeroFL: Efficient On-Device Training for Federated Learning with Local Sparsity
- Zero Pixel Directional Boundary by Vector Transform
- Zero-Shot Self-Supervised Learning for MRI Reconstruction