ICLR
2023
Papers
Contrastive Audio-Visual Masked Autoencoder
Fairness-aware Contrastive Learning with Partially Annotated Sensitive Attributes
Approximation and non-parametric estimation of functions over high-dimensional spheres via deep ReLU networks
Statistical Guarantees for Consensus Clustering
Simplifying Model-based RL: Learning Representations, Latent-space Models, and Policies with One Objective
Seeing Differently, Acting Similarly: Heterogeneously Observable Imitation Learning
Humanly Certifying Superhuman Classifiers
Write and Paint: Generative Vision-Language Models are Unified Modal Learners
A law of adversarial risk, interpolation, and label noise
Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets
Unsupervised 3D Object Learning through Neuron Activity aware Plasticity
Amortised Invariance Learning for Contrastive Self-Supervision
Expressive Monotonic Neural Networks
In-sample Actor Critic for Offline Reinforcement Learning
The Implicit Bias of Minima Stability in Multivariate Shallow ReLU Networks
Fake It Until You Make It: Towards Accurate Near-Distribution Novelty Detection
CktGNN: Circuit Graph Neural Network for Electronic Design Automation
Jointly Learning Visual and Auditory Speech Representations from Raw Data
Collaborative Pure Exploration in Kernel Bandit
Beyond Lipschitz: Sharp Generalization and Excess Risk Bounds for Full-Batch GD
DELTA: Degradation-Free Fully Test-Time Adaptation
Priors, Hierarchy, and Information Asymmetry for Skill Transfer in Reinforcement Learning
Provable Sim-to-real Transfer in Continuous Domain with Partial Observations
Diagnosing and Rectifying Vision Models using Language
Provably Efficient Risk-Sensitive Reinforcement Learning: Iterated CVaR and Worst Path
Nearly Minimax Optimal Offline Reinforcement Learning with Linear Function Approximation: Single-Agent MDP and Markov Game
DensePure: Understanding Diffusion Models for Adversarial Robustness
Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling
Learning ReLU networks to high uniform accuracy is intractable
DiffuSeq: Sequence to Sequence Text Generation with Diffusion Models
Budgeted Training for Vision Transformer
Hyperbolic Deep Reinforcement Learning
Interpretable Debiasing of Vectorized Language Representations with Iterative Orthogonalization
Rethinking Symbolic Regression: Morphology and Adaptability in the Context of Evolutionary Algorithms
Sequential Gradient Coding For Straggler Mitigation
How gradient estimator variance and bias impact learning in neural networks
TempCLR: Temporal Alignment Representation with Contrastive Learning
Optimal Conservative Offline RL with General Function Approximation via Augmented Lagrangian
Distilling Model Failures as Directions in Latent Space
Equivariant Hypergraph Diffusion Neural Operators
Learning to Grow Pretrained Models for Efficient Transformer Training
Is Attention All That NeRF Needs?
NeRF-SOS: Any-View Self-supervised Object Segmentation on Complex Scenes
Function-space regularized Rényi divergences
Finding the Global Semantic Representation in GAN through Fréchet Mean
DexDeform: Dexterous Deformable Object Manipulation with Human Demonstrations and Differentiable Physics
Effective passive membership inference attacks in federated learning against overparameterized models
CoRTX: Contrastive Framework for Real-time Explanation
Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning
A Unified Approach to Reinforcement Learning, Quantal Response Equilibria, and Two-Player Zero-Sum Games
Forward Super-Resolution: How Can GANs Learn Hierarchical Generative Models for Real-World Distributions
Moderate Coreset: A Universal Method of Data Selection for Real-world Data-efficient Deep Learning
Harnessing Out-Of-Distribution Examples via Augmenting Content and Style
Using Language to Extend to Unseen Domains
A Holistic View of Label Noise Transition Matrix in Deep Learning and Beyond
Softened Symbol Grounding for Neuro-symbolic Systems
Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions
Logical Message Passing Networks with One-hop Inference on Atomic Formulas
Transformers Learn Shortcuts to Automata
Noise-Robust De-Duplication at Scale
Guiding Energy-based Models via Contrastive Latent Variables
Equivariance-aware Architectural Optimization of Neural Networks
Information Plane Analysis for Dropout Neural Networks
UniMax: Fairer and More Effective Language Sampling for Large-Scale Multilingual Pretraining
Unsupervised Meta-learning via Few-shot Pseudo-supervised Contrastive Learning
STUNT: Few-shot Tabular Learning with Self-generated Tasks from Unlabeled Tables
HiT-MDP: Learning the SMDP option framework on MDPs with Hidden Temporal Embeddings
Improved Learning-augmented Algorithms for k-means and k-medians Clustering
Progressive Mix-Up for Few-Shot Supervised Multi-Source Domain Transfer
Does Deep Learning Learn to Abstract? A Systematic Probing Framework
RandProx: Primal-Dual Optimization Algorithms with Randomized Proximal Updates
GOGGLE: Generative Modelling for Tabular Data by Learning Relational Structure
Quantized Compressed Sensing with Score-Based Generative Models
Latent Neural ODEs with Sparse Bayesian Multiple Shooting
$\mathcal{O}$-GNN: incorporating ring priors into molecular modeling
Characterizing the spectrum of the NTK via a power series expansion
Self-Stabilization: The Implicit Bias of Gradient Descent at the Edge of Stability
FIFA: Making Fairness More Generalizable in Classifiers Trained on Imbalanced Data
Equivariant Shape-Conditioned Generation of 3D Molecules for Ligand-Based Drug Design
VIPeR: Provably Efficient Algorithm for Offline RL with Neural Function Approximation
$\rm A^2Q$: Aggregation-Aware Quantization for Graph Neural Networks
Exponential Generalization Bounds with Near-Optimal Rates for $L_q$-Stable Algorithms
Learning to Induce Causal Structure
FoSR: First-order spectral rewiring for addressing oversquashing in GNNs
Achieving Near-Optimal Individual Regret & Low Communications in Multi-Agent Bandits
Online Boundary-Free Continual Learning by Scheduled Data Prior
A Higher Precision Algorithm for Computing the $1$-Wasserstein Distance
Bidirectional Language Models Are Also Few-shot Learners
Delving into Semantic Scale Imbalance
On the Trade-Off between Actionable Explanations and the Right to be Forgotten
Over-parameterized Model Optimization with Polyak-Łojasiewicz Condition
Continuous-time identification of dynamic state-space models by deep subspace encoding
AE-FLOW: Autoencoders with Normalizing Flows for Medical Images Anomaly Detection
SpeedyZero: Mastering Atari with Limited Data and Time
Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse
Sampling-based inference for large linear models, with application to linearised Laplace
Constructive TT-representation of the tensors given as index interaction functions with applications
Adaptive Robust Evidential Optimization For Open Set Detection from Imbalanced Data
Is Model Ensemble Necessary? Model-based RL via a Single Model with Lipschitz Regularized Value Function
Lossless Adaptation of Pretrained Vision Models For Robotic Manipulation
Self-Ensemble Protection: Training Checkpoints Are Good Data Protectors
CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning
Learning to Estimate Single-View Volumetric Flow Motions without 3D Supervision
Variational Information Pursuit for Interpretable Predictions
DiffusER: Diffusion via Edit-based Reconstruction
Synthetic Data Generation of Many-to-Many Datasets via Random Graph Generation
One-Pixel Shortcut: On the Learning Preference of Deep Neural Networks
Unified Detoxifying and Debiasing in Language Generation via Inference-time Adaptive Optimization
Boosting Causal Discovery via Adaptive Sample Reweighting
Inequality phenomenon in $l_{\infty}$-adversarial training, and its unrealized threats
Understanding weight-magnitude hyperparameters in training binary networks
GeneFace: Generalized and High-Fidelity Audio-Driven 3D Talking Face Synthesis
Copy is All You Need
Grounding Graph Network Simulators using Physical Sensor Observations
Predictive Inference with Feature Conformal Prediction
The Curious Case of Benign Memorization
Accurate Bayesian Meta-Learning by Accurate Task Posterior Inference
Compositional Prompt Tuning with Motion Cues for Open-vocabulary Video Relation Detection
Guiding continuous operator learning through Physics-based boundary constraints
Preference Transformer: Modeling Human Preferences using Transformers for RL
LexMAE: Lexicon-Bottlenecked Pretraining for Large-Scale Retrieval
SWIFT: Rapid Decentralized Federated Learning via Wait-Free Model Communication
An Equal-Size Hard EM Algorithm for Diverse Dialogue Generation
A GNN-Guided Predict-and-Search Framework for Mixed-Integer Linear Programming
On Explaining Neural Network Robustness with Activation Path
Liquid Structural State-Space Models
Approximate Vanishing Ideal Computations at Scale
Multi-task Self-supervised Graph Neural Networks Enable Stronger Task Generalization
Empowering Graph Representation Learning with Test-Time Graph Transformation
A Statistical Framework for Personalized Federated Learning and Estimation: Theory, Algorithms, and Privacy
DeCap: Decoding CLIP Latents for Zero-Shot Captioning via Text-Only Training
Personalized Reward Learning with Interaction-Grounded Learning (IGL)
Ordered GNN: Ordering Message Passing to Deal with Heterophily and Over-smoothing
Defending against Adversarial Audio via Diffusion Model
Unsupervised Learning for Combinatorial Optimization Needs Meta Learning
Revisit Finetuning strategy for Few-Shot Learning to Transfer the Embeddings
The Augmented Image Prior: Distilling 1000 Classes by Extrapolating from a Single Image
A Non-Asymptotic Analysis of Oversmoothing in Graph Neural Networks
LDMIC: Learning-based Distributed Multi-view Image Coding
Riemannian Metric Learning via Optimal Transport
Meta Learning to Bridge Vision and Language Models for Multimodal Few-Shot Learning
$\mathscr{N}$-WL: A New Hierarchy of Expressivity for Graph Neural Networks
Learning Input-agnostic Manipulation Directions in StyleGAN with Text Guidance
DAVA: Disentangling Adversarial Variational Autoencoder
Test-Time Adaptation via Self-Training with Nearest Neighbor Information
InCoder: A Generative Model for Code Infilling and Synthesis
Artificial Neuronal Ensembles with Learned Context Dependent Gating
A General Rank Preserving Framework for Asymmetric Image Retrieval
Preserving Pre-trained Features Helps Calibrate Fine-tuned Language Models
Can discrete information extraction prompts generalize across language models?
Score-based Continuous-time Discrete Diffusion Models
Any-scale Balanced Samplers for Discrete Space
Statistical Theory of Differentially Private Marginal-based Data Synthesis Algorithms
Reliability of CKA as a Similarity Measure in Deep Learning
When and Why Vision-Language Models Behave like Bags-Of-Words, and What to Do About It?
Order Matters: Agent-by-agent Policy Optimization
FastFill: Efficient Compatible Model Update
Mind the Pool: Convolutional Neural Networks Can Overfit Input Size
DFlow: Learning to Synthesize Better Optical Flow Datasets via a Differentiable Pipeline
Transformer-based World Models Are Happy With 100k Interactions
Differentiable Mathematical Programming for Object-Centric Representation Learning
MAST: Masked Augmentation Subspace Training for Generalizable Self-Supervised Priors
Scalable Subset Sampling with Neural Conditional Poisson Networks
Benchmarking Offline Reinforcement Learning on Real-Robot Hardware
A General Framework For Proving The Equivariant Strong Lottery Ticket Hypothesis
Re-weighting Based Group Fairness Regularization via Classwise Robust Optimization
Empowering Networks With Scale and Rotation Equivariance Using A Similarity Convolution
LMC: Fast Training of GNNs via Subgraph Sampling with Provable Convergence
Neural-based classification rule learning for sequential data
ODAM: Gradient-based Instance-Specific Visual Explanations for Object Detection
Learning with Auxiliary Activation for Memory-Efficient Training
Policy-Based Self-Competition for Planning Problems
Out-of-Distribution Detection based on In-Distribution Data Patterns Memorization with Modern Hopfield Energy
Kernel Neural Optimal Transport
Neural Optimal Transport
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 Small
Discovering Generalizable Multi-agent Coordination Skills from Multi-task Offline Data
Teacher Guided Training: An Efficient Framework for Knowledge Transfer
Measure the Predictive Heterogeneity
Provable Defense Against Geometric Transformations
Tensor-Based Sketching Method for the Low-Rank Approximation of Data Streams
Data augmentation alone can improve adversarial training
CUTS: Neural Causal Discovery from Irregular Time-Series Data
LilNetX: Lightweight Networks with EXtreme Model Compression and Structured Sparsification
Valid P-Value for Deep Learning-driven Salient Region
Pitfalls of Gaussians as a noise distribution in NCE
Broken Neural Scaling Laws
Bridging the Gap between ANNs and SNNs by Calibrating Offset Spikes
Visually-Augmented Language Modeling
A VAE for Transformers with Nonparametric Variational Information Bottleneck
A theoretical study of inductive biases in contrastive learning
Minimum Variance Unbiased N:M Sparsity for the Neural Gradients
Incremental Learning of Structured Memory via Closed-Loop Transcription
Explaining Temporal Graph Models through an Explorer-Navigator Framework
Backpropagation at the Infinitesimal Inference Limit of Energy-Based Models: Unifying Predictive Coding, Equilibrium Propagation, and Contrastive Hebbian Learning
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models
What Do Self-Supervised Vision Transformers Learn?
Benchmarking Constraint Inference in Inverse Reinforcement Learning
Enhancing the Inductive Biases of Graph Neural ODE for Modeling Physical Systems
ESCHER: Eschewing Importance Sampling in Games by Computing a History Value Function to Estimate Regret
Sampling-free Inference for Ab-Initio Potential Energy Surface Networks
That Label's got Style: Handling Label Style Bias for Uncertain Image Segmentation
On The Relative Error of Random Fourier Features for Preserving Kernel Distance
Towards Inferential Reproducibility of Machine Learning Research
Beyond calibration: estimating the grouping loss of modern neural networks
DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing
Weighted Clock Logic Point Process
Squeeze Training for Adversarial Robustness
Asymptotic Instance-Optimal Algorithms for Interactive Decision Making
Is Forgetting Less a Good Inductive Bias for Forward Transfer?
Near-Optimal Deployment Efficiency in Reward-Free Reinforcement Learning with Linear Function Approximation
Where to Begin? On the Impact of Pre-Training and Initialization in Federated Learning
Long-Tailed Partial Label Learning via Dynamic Rebalancing
Global Explainability of GNNs via Logic Combination of Learned Concepts
Task Ambiguity in Humans and Language Models
Discrete Predictor-Corrector Diffusion Models for Image Synthesis
Recon: Reducing Conflicting Gradients From the Root For Multi-Task Learning
More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization
PerFedMask: Personalized Federated Learning with Optimized Masking Vectors
Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks
Imbalanced Semi-supervised Learning with Bias Adaptive Classifier
On Compositional Uncertainty Quantification for Seq2seq Graph Parsing
Free Lunch for Domain Adversarial Training: Environment Label Smoothing
Mastering the Game of No-Press Diplomacy via Human-Regularized Reinforcement Learning and Planning
Brain-like representational straightening of natural movies in robust feedforward neural networks
A Simple Approach for Visual Room Rearrangement: 3D Mapping and Semantic Search
Generative Modeling Helps Weak Supervision (and Vice Versa)
Federated Learning from Small Datasets
Multi-level Protein Structure Pre-training via Prompt Learning
PD-MORL: Preference-Driven Multi-Objective Reinforcement Learning Algorithm
A Mixture-of-Expert Approach to RL-based Dialogue Management
M-L2O: Towards Generalizable Learning-to-Optimize by Test-Time Fast Self-Adaptation
Dichotomy of Control: Separating What You Can Control from What You Cannot
Re-calibrating Feature Attributions for Model Interpretation
Revisiting Populations in multi-agent Communication
Learning multi-scale local conditional probability models of images
Characterizing the Influence of Graph Elements
Disentangling Learning Representations with Density Estimation
Simple and Scalable Nearest Neighbor Machine Translation
Language Models Can Teach Themselves to Program Better
What Is Missing in IRM Training and Evaluation? Challenges and Solutions
A Differential Geometric View and Explainability of GNN on Evolving Graphs
TextGrad: Advancing Robustness Evaluation in NLP by Gradient-Driven Optimization
Consolidator: Mergable Adapter with Group Connections for Visual Adaptation
Cross-Level Distillation and Feature Denoising for Cross-Domain Few-Shot Classification
Continuous PDE Dynamics Forecasting with Implicit Neural Representations
Transfer NAS with Meta-learned Bayesian Surrogates
Hierarchical Sliced Wasserstein Distance
Supervision Complexity and its Role in Knowledge Distillation
LipsFormer: Introducing Lipschitz Continuity to Vision Transformers
Automatic Chain of Thought Prompting in Large Language Models
Near-Optimal Adversarial Reinforcement Learning with Switching Costs
Federated Nearest Neighbor Machine Translation
Language Models are Realistic Tabular Data Generators
Logical Entity Representation in Knowledge-Graphs for Differentiable Rule Learning
MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization
Data Valuation Without Training of a Model
Words are all you need? Language as an approximation for human similarity judgments
Link Prediction with Non-Contrastive Learning
Impossibly Good Experts and How to Follow Them
Scaling Laws For Deep Learning Based Image Reconstruction
Equal Improvability: A New Fairness Notion Considering the Long-term Impact
Certifiably Robust Policy Learning against Adversarial Multi-Agent Communication
Implicit regularization in Heavy-ball momentum accelerated stochastic gradient descent
Hyperparameter Optimization through Neural Network Partitioning
Proto-Value Networks: Scaling Representation Learning with Auxiliary Tasks
Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction
NANSY++: Unified Voice Synthesis with Neural Analysis and Synthesis
Sequential Attention for Feature Selection
Modeling Multimodal Aleatoric Uncertainty in Segmentation with Mixture of Stochastic Experts
Improved Sample Complexity for Reward-free Reinforcement Learning under Low-rank MDPs
Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning
ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor
Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Towards Robustness Certification Against Universal Perturbations
Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints
Meta-prediction Model for Distillation-Aware NAS on Unseen Datasets
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
HyperDeepONet: learning operator with complex target function space using the limited resources via hypernetwork
Self-Supervised Category-Level Articulated Object Pose Estimation with Part-Level SE(3) Equivariance
PiFold: Toward effective and efficient protein inverse folding
Capturing the Motion of Every Joint: 3D Human Pose and Shape Estimation with Independent Tokens
Mind the Gap: Offline Policy Optimization for Imperfect Rewards
Autoregressive Conditional Neural Processes
CASR: Generating Complex Sequences with Autoregressive Self-Boost Refinement
Bias Propagation in Federated Learning
Make-A-Video: Text-to-Video Generation without Text-Video Data
kNN-Diffusion: Image Generation via Large-Scale Retrieval
AudioGen: Textually Guided Audio Generation
DCI-ES: An Extended Disentanglement Framework with Connections to Identifiability
Is a Caption Worth a Thousand Images? A Study on Representation Learning
Almost Linear Constant-Factor Sketching for $\ell_1$ and Logistic Regression
Concept Gradient: Concept-based Interpretation Without Linear Assumption
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Neural Agents Struggle to Take Turns in Bidirectional Emergent Communication
Complexity-Based Prompting for Multi-step Reasoning
Summarization Programs: Interpretable Abstractive Summarization with Neural Modular Trees
Human-Guided Fair Classification for Natural Language Processing
An Adaptive Policy to Employ Sharpness-Aware Minimization
NTK-SAP: Improving neural network pruning by aligning training dynamics
Revisiting the Assumption of Latent Separability for Backdoor Defenses
Holistic Adversarially Robust Pruning
Restricted Strong Convexity of Deep Learning Models with Smooth Activations
TVSPrune - Pruning Non-discriminative filters via Total Variation separability of intermediate representations without fine tuning
DFPC: Data flow driven pruning of coupled channels without data
Graph Neural Network-Inspired Kernels for Gaussian Processes in Semi-Supervised Learning
Minimum Description Length Control
RGI: robust GAN-inversion for mask-free image inpainting and unsupervised pixel-wise anomaly detection
Part-Based Models Improve Adversarial Robustness
Basic Binary Convolution Unit for Binarized Image Restoration Network
Knowledge Distillation based Degradation Estimation for Blind Super-Resolution
Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel
Hybrid RL: Using both offline and online data can make RL efficient
Become a Proficient Player with Limited Data through Watching Pure Videos
Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis
Masked Frequency Modeling for Self-Supervised Visual Pre-Training
On the Convergence of AdaGrad(Norm) on $\mathbb{R}^d$: Beyond Convexity, Non-Asymptotic Rate and Acceleration
Curriculum-based Co-design of Morphology and Control of Voxel-based Soft Robots
Instance-wise Batch Label Restoration via Gradients in Federated Learning
Truthful Self-Play
Zeroth-Order Optimization with Trajectory-Informed Derivative Estimation
Matching receptor to odorant with protein language and graph neural networks
Decompositional Generation Process for Instance-Dependent Partial Label Learning
A Graph Neural Network Approach to Automated Model Building in Cryo-EM Maps
Investigating Multi-task Pretraining and Generalization in Reinforcement Learning
Adversarial Imitation Learning with Preferences
Simple Emergent Action Representations from Multi-Task Policy Training
From $t$-SNE to UMAP with contrastive learning
On the Importance and Applicability of Pre-Training for Federated Learning
When to Make and Break Commitments?
Gromov-Wasserstein Autoencoders
3D UX-Net: A Large Kernel Volumetric ConvNet Modernizing Hierarchical Transformer for Medical Image Segmentation
Neural Episodic Control with State Abstraction
The Surprising Computational Power of Nondeterministic Stack RNNs
Feature Reconstruction From Outputs Can Mitigate Simplicity Bias in Neural Networks
Dynamic Update-to-Data Ratio: Minimizing World Model Overfitting
Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning
FedFA: Federated Feature Augmentation
Efficient Planning in a Compact Latent Action Space
SketchKnitter: Vectorized Sketch Generation with Diffusion Models
Cheap Talk Discovery and Utilization in Multi-Agent Reinforcement Learning
MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning
On the Perils of Cascading Robust Classifiers
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization
FairGBM: Gradient Boosting with Fairness Constraints
Efficient Model Updates for Approximate Unlearning of Graph-Structured Data
Does Learning from Decentralized Non-IID Unlabeled Data Benefit from Self Supervision?
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
How I Learned to Stop Worrying and Love Retraining
Malign Overfitting: Interpolation and Invariance are Fundamentally at Odds
AGRO: Adversarial discovery of error-prone Groups for Robust Optimization
Discovering Latent Knowledge in Language Models Without Supervision
Tier Balancing: Towards Dynamic Fairness over Underlying Causal Factors
RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch
Deep Declarative Dynamic Time Warping for End-to-End Learning of Alignment Paths
PGrad: Learning Principal Gradients For Domain Generalization
Representational Dissimilarity Metric Spaces for Stochastic Neural Networks
Anamnesic Neural Differential Equations with Orthogonal Polynomial Projections
Neural Design for Genetic Perturbation Experiments
Topology-aware Robust Optimization for Out-of-Distribution Generalization
simpleKT: A Simple But Tough-to-Beat Baseline for Knowledge Tracing
DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases
Dr.Spider: A Diagnostic Evaluation Benchmark towards Text-to-SQL Robustness
Disentangling the Mechanisms Behind Implicit Regularization in SGD
GLM-130B: An Open Bilingual Pre-trained Model
DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems
The Onset of Variance-Limited Behavior for Networks in the Lazy and Rich Regimes
Min-Max Multi-objective Bilevel Optimization with Applications in Robust Machine Learning
Vision Transformer Adapter for Dense Predictions
Learning Simultaneous Navigation and Construction in Grid Worlds
GAIN: On the Generalization of Instructional Action Understanding
Text Summarization with Oracle Expectation
Learning Sparse Group Models Through Boolean Relaxation
Combinatorial-Probabilistic Trade-Off: P-Values of Community Properties Test in the Stochastic Block Models
TEMPERA: Test-Time Prompt Editing via Reinforcement Learning
How to Train your HIPPO: State Space Models with Generalized Orthogonal Basis Projections
Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks
Learning Kernelized Contextual Bandits in a Distributed and Asynchronous Environment
Representation Learning for Low-rank General-sum Markov Games
Deep Reinforcement Learning for Cost-Effective Medical Diagnosis
Layer Grafted Pre-training: Bridging Contrastive Learning And Masked Image Modeling For Label-Efficient Representations
Symmetries, Flat Minima, and the Conserved Quantities of Gradient Flow
Noise Injection Node Regularization for Robust Learning
Differentially Private $L_2$-Heavy Hitters in the Sliding Window Model
Temporal Domain Generalization with Drift-Aware Dynamic Neural Networks
Disparate Impact in Differential Privacy from Gradient Misalignment
Few-Shot Domain Adaptation For End-to-End Communication
Planckian Jitter: countering the color-crippling effects of color jitter on self-supervised training
Calibrating Transformers via Sparse Gaussian Processes
Associative Memory Augmented Asynchronous Spatiotemporal Representation Learning for Event-based Perception
ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency
SAM as an Optimal Relaxation of Bayes
The Dark Side of AutoML: Towards Architectural Backdoor Search
ISS: Image as Stepping Stone for Text-Guided 3D Shape Generation
BrainBERT: Self-supervised representation learning for intracranial recordings
Flow Matching for Generative Modeling
Guiding Safe Exploration with Weakest Preconditions
Accelerating Hamiltonian Monte Carlo via Chebyshev Integration Time
Causal Estimation for Text Data with (Apparent) Overlap Violations
Static Prediction of Runtime Errors by Learning to Execute Programs with External Resource Descriptions
TANGOS: Regularizing Tabular Neural Networks through Gradient Orthogonalization and Specialization
Hungry Hungry Hippos: Towards Language Modeling with State Space Models
Understanding new tasks through the lens of training data via exponential tilting
Perfectly Secure Steganography Using Minimum Entropy Coupling
Predictor-corrector algorithms for stochastic optimization under gradual distribution shift
A Call to Reflect on Evaluation Practices for Failure Detection in Image Classification
FunkNN: Neural Interpolation for Functional Generation
Continual Pre-training of Language Models
Accelerated Single-Call Methods for Constrained Min-Max Optimization
MOAT: Alternating Mobile Convolution and Attention Brings Strong Vision Models
Lower Bounds on the Depth of Integral ReLU Neural Networks via Lattice Polytopes
Implicit Bias of Large Depth Networks: a Notion of Rank for Nonlinear Functions
Learning to CROSS exchange to solve min-max vehicle routing problems
Evolving Populations of Diverse RL Agents with MAP-Elites
Equivariant Descriptor Fields: SE(3)-Equivariant Energy-Based Models for End-to-End Visual Robotic Manipulation Learning
PandA: Unsupervised Learning of Parts and Appearances in the Feature Maps of GANs
Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation
Learning Adversarial Linear Mixture Markov Decision Processes with Bandit Feedback and Unknown Transition
The In-Sample Softmax for Offline Reinforcement Learning
Compositional Law Parsing with Latent Random Functions
Reversible Column Networks
Robust Algorithms on Adaptive Inputs from Bounded Adversaries
Selection-Inference: Exploiting Large Language Models for Interpretable Logical Reasoning
Transformers are Sample-Efficient World Models
Towards Minimax Optimal Reward-free Reinforcement Learning in Linear MDPs
Robust Explanation Constraints for Neural Networks
Domain Generalization via Heckman-type Selection Models
Quasi-optimal Reinforcement Learning with Continuous Actions
Generalization Bounds for Federated Learning: Fast Rates, Unparticipating Clients and Unbounded Losses
Agree to Disagree: Diversity through Disagreement for Better Transferability
Which Layer is Learning Faster? A Systematic Exploration of Layer-wise Convergence Rate for Deep Neural Networks
Strong inductive biases provably prevent harmless interpolation
Parametrizing Product Shape Manifolds by Composite Networks
Quantifying and Mitigating the Impact of Label Errors on Model Disparity Metrics
Ollivier-Ricci Curvature for Hypergraphs: A Unified Framework
Hidden Markov Transformer for Simultaneous Machine Translation
Sequential Latent Variable Models for Few-Shot High-Dimensional Time-Series Forecasting
Minimax Optimal Kernel Operator Learning via Multilevel Training
MeshDiffusion: Score-based Generative 3D Mesh Modeling
ChordMixer: A Scalable Neural Attention Model for Sequences with Different Length
FaiREE: fair classification with finite-sample and distribution-free guarantee
On the duality between contrastive and non-contrastive self-supervised learning
ImageNet-X: Understanding Model Mistakes with Factor of Variation Annotations
Emergence of Maps in the Memories of Blind Navigation Agents
Molecular Geometry Pretraining with SE(3)-Invariant Denoising Distance Matching
Scaleformer: Iterative Multi-scale Refining Transformers for Time Series Forecasting
Simplicial Hopfield networks
Adversarial Training of Self-supervised Monocular Depth Estimation against Physical-World Attacks
Turning the Curse of Heterogeneity in Federated Learning into a Blessing for Out-of-Distribution Detection
D4AM: A General Denoising Framework for Downstream Acoustic Models
Finding Actual Descent Directions for Adversarial Training
Interpretations of Domain Adaptations via Layer Variational Analysis
Data Continuity Matters: Improving Sequence Modeling with Lipschitz Regularizer
MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction
Policy Expansion for Bridging Offline-to-Online Reinforcement Learning
Mitigating Memorization of Noisy Labels via Regularization between Representations
ROSCOE: A Suite of Metrics for Scoring Step-by-Step Reasoning
Learning Cut Selection for Mixed-Integer Linear Programming via Hierarchical Sequence Model
Planning Goals for Exploration
DASHA: Distributed Nonconvex Optimization with Communication Compression and Optimal Oracle Complexity
Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning
AutoGT: Automated Graph Transformer Architecture Search
Progress measures for grokking via mechanistic interpretability
Active Learning in Bayesian Neural Networks with Balanced Entropy Learning Principle
Variance-Aware Sparse Linear Bandits
Label-free Concept Bottleneck Models
The Role of ImageNet Classes in Fréchet Inception Distance
Embedding Fourier for Ultra-High-Definition Low-Light Image Enhancement
Optimal Transport for Offline Imitation Learning
Does Zero-Shot Reinforcement Learning Exist?
A Self-Attention Ansatz for Ab-initio Quantum Chemistry
A Neural Mean Embedding Approach for Back-door and Front-door Adjustment
TranSpeech: Speech-to-Speech Translation With Bilateral Perturbation
Certified Training: Small Boxes are All You Need
CLIP-ViP: Adapting Pre-trained Image-Text Model to Video-Language Alignment
BSTT: A Bayesian Spatial-Temporal Transformer for Sleep Staging
Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning
Pre-training via Denoising for Molecular Property Prediction
AANG: Automating Auxiliary Learning
Dual Algorithmic Reasoning
SoftMatch: Addressing the Quantity-Quality Tradeoff in Semi-supervised Learning
Equivariant Energy-Guided SDE for Inverse Molecular Design
Quantile Risk Control: A Flexible Framework for Bounding the Probability of High-Loss Predictions
Revocable Deep Reinforcement Learning with Affinity Regularization for Outlier-Robust Graph Matching
On the Feasibility of Cross-Task Transfer with Model-Based Reinforcement Learning
Learning Continuous Normalizing Flows For Faster Convergence To Target Distribution via Ascent Regularizations
A Simple Yet Powerful Deep Active Learning With Snapshots Ensembles
Towards Interpretable Deep Reinforcement Learning with Human-Friendly Prototypes
The Trade-off between Universality and Label Efficiency of Representations from Contrastive Learning
Simplicial Embeddings in Self-Supervised Learning and Downstream Classification
Learning Label Encodings for Deep Regression
Offline Congestion Games: How Feedback Type Affects Data Coverage Requirement
Neural Implicit Shape Editing using Boundary Sensitivity
Multifactor Sequential Disentanglement via Structured Koopman Autoencoders
Asynchronous Distributed Bilevel Optimization
Towards Open Temporal Graph Neural Networks
VA-DepthNet: A Variational Approach to Single Image Depth Prediction
Graph Contrastive Learning for Skeleton-based Action Recognition
Strategic Classification with Graph Neural Networks
Memory Gym: Partially Observable Challenges to Memory-Based Agents
What learning algorithm is in-context learning? Investigations with linear models
Discovering Policies with DOMiNO: Diversity Optimization Maintaining Near Optimality
Learning to Extrapolate: A Transductive Approach
Learning Achievement Structure for Structured Exploration in Domains with Sparse Reward
How Sharpness-Aware Minimization Minimizes Sharpness?
Benign Overfitting in Classification: Provably Counter Label Noise with Larger Models
DreamFusion: Text-to-3D using 2D Diffusion
The Modality Focusing Hypothesis: Towards Understanding Crossmodal Knowledge Distillation
What Can we Learn From The Selective Prediction And Uncertainty Estimation Performance Of 523 Imagenet Classifiers?
The Symmetric Generalized Eigenvalue Problem as a Nash Equilibrium
Integrating Symmetry into Differentiable Planning with Steerable Convolutions
Deep Generative Symbolic Regression
Addressing Parameter Choice Issues in Unsupervised Domain Adaptation by Aggregation
Correlative Information Maximization Based Biologically Plausible Neural Networks for Correlated Source Separation
Batch Multivalid Conformal Prediction
Neural Networks and the Chomsky Hierarchy
SoftZoo: A Soft Robot Co-design Benchmark For Locomotion In Diverse Environments
A System for Morphology-Task Generalization via Unified Representation and Behavior Distillation
This Looks Like It Rather Than That: ProtoKNN For Similarity-Based Classifiers
DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion
Towards Understanding GD with Hard and Conjugate Pseudo-labels for Test-Time Adaptation
Context-enriched molecule representations improve few-shot drug discovery
Boosting Adversarial Transferability using Dynamic Cues
Scalable and Equivariant Spherical CNNs by Discrete-Continuous (DISCO) Convolutions
When Source-Free Domain Adaptation Meets Learning with Noisy Labels
Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
CrAM: A Compression-Aware Minimizer
Semi-Implicit Variational Inference via Score Matching
Generative Augmented Flow Networks
Accurate Neural Training with 4-bit Matrix Multiplications at Standard Formats
Multiple sequence alignment as a sequence-to-sequence learning problem
A Laplace-inspired Distribution on SO(3) for Probabilistic Rotation Estimation
A Primal-Dual Framework for Transformers and Neural Networks
Solving Constrained Variational Inequalities via a First-order Interior Point-based Method
Spectral Augmentation for Self-Supervised Learning on Graphs
Neural Causal Models for Counterfactual Identification and Estimation
The Influence of Learning Rule on Representation Dynamics in Wide Neural Networks
Uni-Mol: A Universal 3D Molecular Representation Learning Framework
PASHA: Efficient HPO and NAS with Progressive Resource Allocation
Sign and Basis Invariant Networks for Spectral Graph Representation Learning
Betty: An Automatic Differentiation Library for Multilevel Optimization
EA-HAS-Bench: Energy-aware Hyperparameter and Architecture Search Benchmark
Structure by Architecture: Structured Representations without Regularization
A Non-monotonic Self-terminating Language Model
Learning topology-preserving data representations
Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach
Quantifying Memorization Across Neural Language Models
Task-customized Masked Autoencoder via Mixture of Cluster-conditional Experts
KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in Low-Resource NLP
Scaffolding a Student to Instill Knowledge
Efficient Edge Inference by Selective Query
Graph Signal Sampling for Inductive One-Bit Matrix Completion: a Closed-form Solution
Faster Gradient-Free Methods for Escaping Saddle Points
Can Agents Run Relay Race with Strangers? Generalization of RL to Out-of-Distribution Trajectories
Semi-Parametric Inducing Point Networks and Neural Processes
Taking a Step Back with KCal: Multi-Class Kernel-Based Calibration for Deep Neural Networks
Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task
Progressive Prompts: Continual Learning for Language Models
Hyperbolic Self-paced Learning for Self-supervised Skeleton-based Action Representations
Iterative Patch Selection for High-Resolution Image Recognition
ReAct: Synergizing Reasoning and Acting in Language Models
Efficient Offline Policy Optimization with a Learned Model
Learning to Solve Constraint Satisfaction Problems with Recurrent Transformer
Timing is Everything: Learning to Act Selectively with Costly Actions and Budgetary Constraints
New Insights for the Stability-Plasticity Dilemma in Online Continual Learning
Temporal Dependencies in Feature Importance for Time Series Prediction
Video Scene Graph Generation from Single-Frame Weak Supervision
Versatile Neural Processes for Learning Implicit Neural Representations
Human Motion Diffusion Model
Compressing multidimensional weather and climate data into neural networks
SGDA with shuffling: faster convergence for nonconvex-PŁ minimax optimization
Simplified State Space Layers for Sequence Modeling
Minimalistic Unsupervised Representation Learning with the Sparse Manifold Transform
SQA3D: Situated Question Answering in 3D Scenes
A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together!
Rethinking the Expressive Power of GNNs via Graph Biconnectivity
Bayes Risk CTC: Controllable CTC Alignment in Sequence-to-Sequence Tasks
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games
Unsupervised Model Selection for Time Series Anomaly Detection
Temporal Disentanglement of Representations for Improved Generalisation in Reinforcement Learning
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
NTFields: Neural Time Fields for Physics-Informed Robot Motion Planning
Backpropagation through Combinatorial Algorithms: Identity with Projection Works
DiffEdit: Diffusion-based semantic image editing with mask guidance
Towards Stable Test-time Adaptation in Dynamic Wild World
Denoising Masked Autoencoders Help Robust Classification
EquiMod: An Equivariance Module to Improve Visual Instance Discrimination
One Transformer Can Understand Both 2D & 3D Molecular Data
Warping the Space: Weight Space Rotation for Class-Incremental Few-Shot Learning
FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning
Learning with Logical Constraints but without Shortcut Satisfaction
Robust and Controllable Object-Centric Learning through Energy-based Models
Do We Really Need Complicated Model Architectures For Temporal Networks?
DepthFL: Depthwise Federated Learning for Heterogeneous Clients
Over-Training with Mixup May Hurt Generalization
Self-supervised learning with rotation-invariant kernels
Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning
Causal Balancing for Domain Generalization
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
Why (and When) does Local SGD Generalize Better than SGD?
Offline RL with No OOD Actions: In-Sample Learning via Implicit Value Regularization
Imitating Human Behaviour with Diffusion Models
Better Generative Replay for Continual Federated Learning
WikiWhy: Answering and Explaining Cause-and-Effect Questions
Exploring The Role of Mean Teachers in Self-supervised Masked Auto-Encoders
EVC: Towards Real-Time Neural Image Compression with Mask Decay
Sparse Token Transformer with Attention Back Tracking
Short-Term Memory Convolutions
MaskViT: Masked Visual Pre-Training for Video Prediction
Revisiting Pruning at Initialization Through the Lens of Ramanujan Graph
Sharper Bounds for Uniformly Stable Algorithms with Stationary Mixing Process
Analogy-Forming Transformers for Few-Shot 3D Parsing
FluidLab: A Differentiable Environment for Benchmarking Complex Fluid Manipulation
Meta Temporal Point Processes
Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules
How to prepare your task head for finetuning
Dataset Pruning: Reducing Training Data by Examining Generalization Influence
Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs
KwikBucks: Correlation Clustering with Cheap-Weak and Expensive-Strong Signals
Automated Data Augmentations for Graph Classification
Population-size-Aware Policy Optimization for Mean-Field Games
Learning Soft Constraints From Constrained Expert Demonstrations
Energy-based Out-of-Distribution Detection for Graph Neural Networks
Learning Fast and Slow for Online Time Series Forecasting
Towards Lightweight, Model-Agnostic and Diversity-Aware Active Anomaly Detection
An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion
Neural Compositional Rule Learning for Knowledge Graph Reasoning
How robust is unsupervised representation learning to distribution shift?
Distilling Cognitive Backdoor Patterns within an Image
Iterative Circuit Repair Against Formal Specifications
MocoSFL: enabling cross-client collaborative self-supervised learning
Better Teacher Better Student: Dynamic Prior Knowledge for Knowledge Distillation
MECTA: Memory-Economic Continual Test-Time Model Adaptation
Continuous pseudo-labeling from the start
IDEAL: Query-Efficient Data-Free Learning from Black-Box Models
Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes
DiGress: Discrete Denoising diffusion for graph generation
Deja Vu: Continual Model Generalization for Unseen Domains
Q-Pensieve: Boosting Sample Efficiency of Multi-Objective RL Through Memory Sharing of Q-Snapshots
Adversarial Diversity in Hanabi
Statistical Efficiency of Score Matching: The View from Isoperimetry
Leveraging Future Relationship Reasoning for Vehicle Trajectory Prediction
LMSeg: Language-guided Multi-dataset Segmentation
RoPAWS: Robust Semi-supervised Representation Learning from Uncurated Data
Toward Adversarial Training on Contextualized Language Representation
BC-IRL: Learning Generalizable Reward Functions from Demonstrations
Omnigrok: Grokking Beyond Algorithmic Data
GRACE-C: Generalized Rate Agnostic Causal Estimation via Constraints
Actionable Neural Representations: Grid Cells from Minimal Constraints
Optimal Activation Functions for the Random Features Regression Model
EUCLID: Towards Efficient Unsupervised Reinforcement Learning with Multi-choice Dynamics Model
Sparse tree-based Initialization for Neural Networks
A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias
Cycle to Clique (Cy2C) Graph Neural Network: A Sight to See beyond Neighborhood Aggregation
MaskFusion: Feature Augmentation for Click-Through Rate Prediction via Input-adaptive Mask Fusion
Rethinking Self-Supervised Visual Representation Learning in Pre-training for 3D Human Pose and Shape Estimation
Learned Index with Dynamic $\epsilon$
Boosting Multiagent Reinforcement Learning via Permutation Invariant and Permutation Equivariant Networks
DAG Learning on the Permutahedron
UNICORN: A Unified Backdoor Trigger Inversion Framework
PV3D: A 3D Generative Model for Portrait Video Generation
UNIFIED-IO: A Unified Model for Vision, Language, and Multi-modal Tasks
On The Specialization of Neural Modules
Learning What and Where: Disentangling Location and Identity Tracking Without Supervision
Mass-Editing Memory in a Transformer
Few-shot Backdoor Attacks via Neural Tangent Kernels
NERDS: A General Framework to Train Camera Denoisers from Raw-RGB Noisy Image Pairs
Programmatically Grounded, Compositionally Generalizable Robotic Manipulation
Truncated Diffusion Probabilistic Models and Diffusion-based Adversarial Auto-Encoders
ILA-DA: Improving Transferability of Intermediate Level Attack with Data Augmentation
Contrastive Alignment of Vision to Language Through Parameter-Efficient Transfer Learning
Diffusion-GAN: Training GANs with Diffusion
BEEF: Bi-Compatible Class-Incremental Learning via Energy-Based Expansion and Fusion
QuAnt: Quantum Annealing with Learnt Couplings
MACTA: A Multi-agent Reinforcement Learning Approach for Cache Timing Attacks and Detection
Schema Inference for Interpretable Image Classification
Relative representations enable zero-shot latent space communication
On Achieving Optimal Adversarial Test Error
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Spiking Convolutional Neural Networks for Text Classification
A Kernel Perspective of Skip Connections in Convolutional Networks
In-Situ Text-Only Adaptation of Speech Models with Low-Overhead Speech Imputations
Confidence-Conditioned Value Functions for Offline Reinforcement Learning
Unsupervised Semantic Segmentation with Self-supervised Object-centric Representations
Analyzing Tree Architectures in Ensembles via Neural Tangent Kernel
Aligning Model and Macaque Inferior Temporal Cortex Representations Improves Model-to-Human Behavioral Alignment and Adversarial Robustness
Progressive Voronoi Diagram Subdivision Enables Accurate Data-free Class-Incremental Learning
Domain-Indexing Variational Bayes: Interpretable Domain Index for Domain Adaptation
Explicitly Minimizing the Blur Error of Variational Autoencoders
Learning to Estimate Shapley Values with Vision Transformers
HotProtein: A Novel Framework for Protein Thermostability Prediction and Editing
Hyper-Decision Transformer for Efficient Online Policy Adaptation
Loss Landscapes are All You Need: Neural Network Generalization Can Be Explained Without the Implicit Bias of Gradient Descent
Sparsity-Constrained Optimal Transport
Real-Time Image Demoiréing on Mobile Devices
RPM: Generalizable Multi-Agent Policies for Multi-Agent Reinforcement Learning
Medical Image Understanding with Pretrained Vision Language Models: A Comprehensive Study
Understanding Influence Functions and Datamodels via Harmonic Analysis
Stochastic No-regret Learning for General Games with Variance Reduction
When Data Geometry Meets Deep Function: Generalizing Offline Reinforcement Learning
MIMT: Masked Image Modeling Transformer for Video Compression
Graph-based Deterministic Policy Gradient for Repetitive Combinatorial Optimization Problems
Efficient Discrete Multi Marginal Optimal Transport Regularization
Evaluating Representations with Readout Model Switching
Sequential Learning of Neural Networks for Prequential MDL
Explaining RL Decisions with Trajectories
Efficient Conditionally Invariant Representation Learning
Stable Target Field for Reduced Variance Score Estimation in Diffusion Models
Relative Behavioral Attributes: Filling the Gap between Symbolic Goal Specification and Reward Learning from Human Preferences
ROCO: A General Framework for Evaluating Robustness of Combinatorial Optimization Solvers on Graphs
DINO as a von Mises-Fisher mixture model
The Lie Derivative for Measuring Learned Equivariance
Continuized Acceleration for Quasar Convex Functions in Non-Convex Optimization
Meta-Learning in Games
Universal Approximation Theorems for Differentiable Geometric Deep Learning
Provable Memorization Capacity of Transformers
Bridge the Inference Gaps of Neural Processes via Expectation Maximization
Masked Vision and Language Modeling for Multi-modal Representation Learning
Massively Scaling Heteroscedastic Classifiers
CROM: Continuous Reduced-Order Modeling of PDEs Using Implicit Neural Representations
Certified Defences Against Adversarial Patch Attacks on Semantic Segmentation
Last Layer Re-Training is Sufficient for Robustness to Spurious Correlations
Decision Transformer under Random Frame Dropping
Visual Classification via Description from Large Language Models
De Novo Molecular Generation via Connection-aware Motif Mining
Targeted Hyperparameter Optimization with Lexicographic Preferences Over Multiple Objectives
Towards Smooth Video Composition
Revisiting Graph Adversarial Attack and Defense From a Data Distribution Perspective
How to Exploit Hyperspherical Embeddings for Out-of-Distribution Detection?
How Much Space Has Been Explored? Measuring the Chemical Space Covered by Databases and Machine-Generated Molecules
E-CRF: Embedded Conditional Random Field for Boundary-caused Class Weights Confusion in Semantic Segmentation
Online Bias Correction for Task-Free Continual Learning
Arbitrary Virtual Try-on Network: Characteristics Representation and Trade-off between Body and Clothing
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Don’t fear the unlabelled: safe semi-supervised learning via debiasing
Learning differentiable solvers for systems with hard constraints
DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images
SlotFormer: Unsupervised Visual Dynamics Simulation with Object-Centric Models
A framework for benchmarking Class-out-of-distribution detection and its application to ImageNet
Token Merging: Your ViT But Faster
NAGphormer: A Tokenized Graph Transformer for Node Classification in Large Graphs
Ensuring DNN Solution Feasibility for Optimization Problems with Linear Constraints
Retrieval-based Controllable Molecule Generation
Prompt-to-Prompt Image Editing with Cross-Attention Control
Localized Randomized Smoothing for Collective Robustness Certification
Hebbian Deep Learning Without Feedback
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Generalizing and Decoupling Neural Collapse via Hyperspherical Uniformity Gap
Distributionally Robust Post-hoc Classifiers under Prior Shifts
More ConvNets in the 2020s: Scaling up Kernels Beyond 51x51 using Sparsity
Stochastic Multi-Person 3D Motion Forecasting
TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second
Learning Structured Representations by Embedding Class Hierarchy
Multi-Objective Reinforcement Learning: Convexity, Stationarity and Pareto Optimality
Asynchronous Gradient Play in Zero-Sum Multi-agent Games
Novel View Synthesis with Diffusion Models
Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Weighting
Behavior Prior Representation learning for Offline Reinforcement Learning
Trading Information between Latents in Hierarchical Variational Autoencoders
MEDFAIR: Benchmarking Fairness for Medical Imaging
Decoupled Training for Long-Tailed Classification With Stochastic Representations
Fuzzy Alignments in Directed Acyclic Graph for Non-Autoregressive Machine Translation
Building a Subspace of Policies for Scalable Continual Learning
Unbiased Stochastic Proximal Solver for Graph Neural Networks with Equilibrium States
3D Segmenter: 3D Transformer based Semantic Segmentation via 2D Panoramic Distillation
TabCaps: A Capsule Neural Network for Tabular Data Classification with BoW Routing
GOOD: Exploring geometric cues for detecting objects in an open world
Average Sensitivity of Decision Tree Learning
Towards a Unified Theoretical Understanding of Non-contrastive Learning via Rank Differential Mechanism
Tailoring Language Generation Models under Total Variation Distance
H2RBox: Horizontal Box Annotation is All You Need for Oriented Object Detection
The KFIoU Loss for Rotated Object Detection
Systematic Rectification of Language Models via Dead-end Analysis
Anisotropic Message Passing: Graph Neural Networks with Directional and Long-Range Interactions
SYNC: Safety-Aware Neural Control for Stabilizing Stochastic Delay-Differential Equations
PatchDCT: Patch Refinement for High Quality Instance Segmentation
A Message Passing Perspective on Learning Dynamics of Contrastive Learning
A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental Learning
Improved Convergence of Differential Private SGD with Gradient Clipping
Toeplitz Neural Network for Sequence Modeling
Sub-Task Decomposition Enables Learning in Sequence to Sequence Tasks
On the Word Boundaries of Emergent Languages Based on Harris's Articulation Scheme
Effective Self-supervised Pre-training on Low-compute Networks without Distillation
On the Soft-Subnetwork for Few-Shot Class Incremental Learning
TiAda: A Time-scale Adaptive Algorithm for Nonconvex Minimax Optimization
Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding
Approximate Nearest Neighbor Search through Modern Error-Correcting Codes
Boosting the Cycle Counting Power of Graph Neural Networks with I$^2$-GNNs
Offline Reinforcement Learning with Differentiable Function Approximation is Provably Efficient
Dense RGB SLAM with Neural Implicit Maps
Monocular Scene Reconstruction with 3D SDF Transformers
Learning Heterogeneous Interaction Strengths by Trajectory Prediction with Graph Neural Network
Online Low Rank Matrix Completion
Robust Fair Clustering: A Novel Fairness Attack and Defense Framework
Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
Generalize Learned Heuristics to Solve Large-scale Vehicle Routing Problems in Real-time
On the complexity of nonsmooth automatic differentiation
DaxBench: Benchmarking Deformable Object Manipulation with Differentiable Physics
Exploring perceptual straightness in learned visual representations
CO3: Cooperative Unsupervised 3D Representation Learning for Autonomous Driving
Understanding the Role of Nonlinearity in Training Dynamics of Contrastive Learning
Bag of Tricks for Unsupervised Text-to-Speech
Information-Theoretic Characterization of the Generalization Error for Iterative Semi-Supervised Learning
Composing Task Knowledge With Modular Successor Feature Approximators
Advancing Radiograph Representation Learning with Masked Record Modeling
Verifying the Union of Manifolds Hypothesis for Image Data
GAMR: A Guided Attention Model for (visual) Reasoning
Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth
Energy-Based Test Sample Adaptation for Domain Generalization
On the Saturation Effect of Kernel Ridge Regression
Re-parameterizing Your Optimizers rather than Architectures
PLOT: Prompt Learning with Optimal Transport for Vision-Language Models
Linearly Mapping from Image to Text Space
Protein Representation Learning via Knowledge Enhanced Primary Structure Reasoning
The Provable Benefit of Unsupervised Data Sharing for Offline Reinforcement Learning
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
Exploring Active 3D Object Detection from a Generalization Perspective
Gradient Gating for Deep Multi-Rate Learning on Graphs
Diffusion Models Already Have A Semantic Latent Space
Masked Image Modeling with Denoising Contrast
GoBigger: A Scalable Platform for Cooperative-Competitive Multi-Agent Interactive Simulation
Contrastive Learning for Unsupervised Domain Adaptation of Time Series
TILP: Differentiable Learning of Temporal Logical Rules on Knowledge Graphs
Masked Unsupervised Self-training for Label-free Image Classification
Learning the Positions in CountSketch
Deep Ensembles for Graphs with Higher-order Dependencies
Globally Optimal Training of Neural Networks with Threshold Activation Functions
Sound Randomized Smoothing in Floating-Point Arithmetic
Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms
Out-of-distribution Representation Learning for Time Series Classification
AnyDA: Anytime Domain Adaptation
Koopman Neural Operator Forecaster for Time-series with Temporal Distributional Shifts
Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models
Sampling with Mollified Interaction Energy Descent
Multimodal Federated Learning via Contrastive Representation Ensemble
Leveraging Importance Weights in Subset Selection
Can CNNs Be More Robust Than Transformers?
Optimistic Exploration with Learned Features Provably Solves Markov Decision Processes with Neural Dynamics
Trainable Weight Averaging: Efficient Training by Optimizing Historical Solutions
Causal Imitation Learning via Inverse Reinforcement Learning
FIT: A Metric for Model Sensitivity
Pushing the Accuracy-Group Robustness Frontier with Introspective Self-play
Progressively Compressed Auto-Encoder for Self-supervised Representation Learning
Understanding Edge-of-Stability Training Dynamics with a Minimalist Example
Learning to Decompose Visual Features with Latent Textual Prompts
Dual Diffusion Implicit Bridges for Image-to-Image Translation
Calibrating Sequence likelihood Improves Conditional Language Generation
Learning Proximal Operators to Discover Multiple Optima
REPAIR: REnormalizing Permuted Activations for Interpolation Repair
Neural Radiance Field Codebooks
Discrete Contrastive Diffusion for Cross-Modal Music and Image Generation
Diffusion Probabilistic Modeling of Protein Backbones in 3D for the motif-scaffolding problem
ISAAC Newton: Input-based Approximate Curvature for Newton's Method
Revisiting adapters with adversarial training
Heterogeneous Neuronal and Synaptic Dynamics for Spike-Efficient Unsupervised Learning: Theory and Design Principles
Rethinking Graph Lottery Tickets: Graph Sparsity Matters
Private Federated Learning Without a Trusted Server: Optimal Algorithms for Convex Losses
Unicom: Universal and Compact Representation Learning for Image Retrieval
Zero-Shot Image Restoration Using Denoising Diffusion Null-Space Model
Symbolic Physics Learner: Discovering governing equations via Monte Carlo tree search
Flow Annealed Importance Sampling Bootstrap
Calibration Matters: Tackling Maximization Bias in Large-scale Advertising Recommendation Systems
Replicable Bandits
Eva: Practical Second-order Optimization with Kronecker-vectorized Approximation
On the Robustness of Safe Reinforcement Learning under Observational Perturbations
Image as Set of Points
Trainability Preserving Neural Pruning
Graph Neural Networks for Link Prediction with Subgraph Sketching
Learning Locality and Isotropy in Dialogue Modeling
A Unified Algebraic Perspective on Lipschitz Neural Networks
Information-Theoretic Diffusion
Evaluating Long-Term Memory in 3D Mazes
Contrastive Meta-Learning for Partially Observable Few-Shot Learning
Hebbian and Gradient-based Plasticity Enables Robust Memory and Rapid Learning in RNNs
Proactive Multi-Camera Collaboration for 3D Human Pose Estimation
Promptagator: Few-shot Dense Retrieval From 8 Examples
Backstepping Temporal Difference Learning
Human MotionFormer: Transferring Human Motions with Vision Transformers
Memorization-Dilation: Modeling Neural Collapse Under Noise
On Representing Mixed-Integer Linear Programs by Graph Neural Networks
Revisiting Robustness in Graph Machine Learning
Parallel Deep Neural Networks Have Zero Duality Gap
Reward Design with Language Models
Contrastive Corpus Attribution for Explaining Representations
Multi-domain image generation and translation with identifiability guarantees
Continual evaluation for lifelong learning: Identifying the stability gap
Can We Faithfully Represent Absence States to Compute Shapley Values on a DNN?
Dataless Knowledge Fusion by Merging Weights of Language Models
Long Range Language Modeling via Gated State Spaces
Transformer Meets Boundary Value Inverse Problems
Masked Distillation with Receptive Tokens
Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval
TextShield: Beyond Successfully Detecting Adversarial Sentences in text classification
Meta-learning Adaptive Deep Kernel Gaussian Processes for Molecular Property Prediction
Sparse Random Networks for Communication-Efficient Federated Learning
MetaGL: Evaluation-Free Selection of Graph Learning Models via Meta-Learning
Scalable Batch-Mode Deep Bayesian Active Learning via Equivalence Class Annealing
Spatial Attention Kinetic Networks with E(n)-Equivariance
Distributed Differential Privacy in Multi-Armed Bandits
Coverage-centric Coreset Selection for High Pruning Rates
Achieving Sub-linear Regret in Infinite Horizon Average Reward Constrained MDP with Linear Function Approximation
Encoding Recurrence into Transformers
Learning Hyper Label Model for Programmatic Weak Supervision
FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy
CLARE: Conservative Model-Based Reward Learning for Offline Inverse Reinforcement Learning
Generalized Precision Matrix for Scalable Estimation of Nonparametric Markov Networks
Learning to Jointly Share and Prune Weights for Grounding Based Vision and Language Models
Domain Generalisation via Domain Adaptation: An Adversarial Fourier Amplitude Approach
GReTo: Remedying dynamic graph topology-task discordance via target homophily
Everybody Needs Good Neighbours: An Unsupervised Locality-based Method for Bias Mitigation
Combating Exacerbated Heterogeneity for Robust Models in Federated Learning
Improving Deep Regression with Ordinal Entropy
Language Modelling with Pixels
Bitrate-Constrained DRO: Beyond Worst Case Robustness To Unknown Group Shifts
Neural Architecture Design and Robustness: A Dataset
Is Adversarial Training Really a Silver Bullet for Mitigating Data Poisoning?
Policy Pre-training for Autonomous Driving via Self-supervised Geometric Modeling
AutoTransfer: AutoML with Knowledge Transfer - An Application to Graph Neural Networks
The Asymmetric Maximum Margin Bias of Quasi-Homogeneous Neural Networks
Decentralized Optimistic Hyperpolicy Mirror Descent: Provably No-Regret Learning in Markov Games
Exploring the Limits of Differentially Private Deep Learning with Group-wise Clipping
HypeR: Multitask Hyper-Prompted Training Enables Large-Scale Retrieval Generalization
Learning without Prejudices: Continual Unbiased Learning via Benign and Malignant Forgetting
Selective Annotation Makes Language Models Better Few-Shot Learners
Switch-NeRF: Learning Scene Decomposition with Mixture of Experts for Large-scale Neural Radiance Fields
Scaling Forward Gradient With Local Losses
NORM: Knowledge Distillation via N-to-One Representation Matching
Critic Sequential Monte Carlo
(Certified!!) Adversarial Robustness for Free!
Measuring Forgetting of Memorized Training Examples
Multi-lingual Evaluation of Code Generation Models
wav2tok: Deep Sequence Tokenizer for Audio Retrieval
Deep Learning meets Nonparametric Regression: Are Weight-Decayed DNNs Locally Adaptive?
Robust Active Distillation
SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation
Joint Edge-Model Sparse Learning is Provably Efficient for Graph Neural Networks
Near-optimal Policy Identification in Active Reinforcement Learning
Spherical Sliced-Wasserstein
InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised Learning
Subquadratic Algorithms for Kernel Matrices via Kernel Density Estimation
MixPro: Data Augmentation with MaskMix and Progressive Attention Labeling for Vision Transformer
Linear Convergence of Natural Policy Gradient Methods with Log-Linear Policies
SP2 : A Second Order Stochastic Polyak Method
CLIP-Dissect: Automatic Description of Neuron Representations in Deep Vision Networks
Continual Transformers: Redundancy-Free Attention for Online Inference
Dirichlet-based Uncertainty Calibration for Active Domain Adaptation
Accurate Image Restoration with Attention Retractable Transformer
Predicting Cellular Responses with Variational Causal Inference and Refined Relational Information
Self-Supervised Set Representation Learning for Unsupervised Meta-Learning
Causal Representation Learning for Instantaneous and Temporal Effects in Interactive Systems
Visual Imitation Learning with Patch Rewards
CodeT: Code Generation with Generated Tests
Learning to Generate Columns with Application to Vertex Coloring
Interaction-Based Disentanglement of Entities for Object-Centric World Models
Learning where and when to reason in neuro-symbolic inference
On the Usefulness of Embeddings, Clusters and Strings for Text Generation Evaluation
StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training
Plateau in Monotonic Linear Interpolation --- A "Biased" View of Loss Landscape for Deep Networks
Improving Deep Policy Gradients with Value Function Search
Depth Separation with Multilayer Mean-Field Networks
Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting
Individual Privacy Accounting with Gaussian Differential Privacy
Non-parametric Outlier Synthesis
General Neural Gauge Fields
Generate rather than Retrieve: Large Language Models are Strong Context Generators
Discovering Informative and Robust Positives for Video Domain Adaptation
Understanding Why Generalized Reweighting Does Not Improve Over ERM
Gradient-Guided Importance Sampling for Learning Binary Energy-Based Models
Neural Lagrangian Schrödinger Bridge: Diffusion Modeling for Population Dynamics
Is Conditional Generative Modeling all you need for Decision Making?
Fair Attribute Completion on Graph with Missing Attributes
Planning with Sequence Models through Iterative Energy Minimization
Composing Ensembles of Pre-trained Models via Iterative Consensus
Deep Ranking Ensembles for Hyperparameter Optimization
Robustness to corruption in pre-trained Bayesian neural networks
Neural Groundplans: Persistent Neural Scene Representations from a Single Image
Improved Training of Physics-Informed Neural Networks Using Energy-Based Priors: a Study on Electrical Impedance Tomography
Disentanglement with Biological Constraints: A Theory of Functional Cell Types
ERL-Re$^2$: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation
Towards Understanding Why Mask Reconstruction Pretraining Helps in Downstream Tasks
Continual Unsupervised Disentangling of Self-Organizing Representations
Accelerating Guided Diffusion Sampling with Splitting Numerical Methods
Thalamus: a brain-inspired algorithm for biologically-plausible continual learning and disentangled representations
LiftedCL: Lifting Contrastive Learning for Human-Centric Perception
SIMPLE: A Gradient Estimator for k-Subset Sampling
Modeling content creator incentives on algorithm-curated platforms
Deep Variational Implicit Processes
Estimating individual treatment effects under unobserved confounding using binary instruments
Approximate Bayesian Inference with Stein Functional Variational Gradient Descent
Distributional Meta-Gradient Reinforcement Learning
Learning to Linearize Deep Neural Networks for Secure and Efficient Private Inference
Denoising Diffusion Error Correction Codes
Temperature Schedules for self-supervised contrastive methods on long-tail data
Meta Knowledge Condensation for Federated Learning
ManiSkill2: A Unified Benchmark for Generalizable Manipulation Skills
Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning
Neuro-Symbolic Procedural Planning with Commonsense Prompting
Learning Object-Language Alignments for Open-Vocabulary Object Detection
Time to augment self-supervised visual representation learning
Learning Sparse and Low-Rank Priors for Image Recovery via Iterative Reweighted Least Squares Minimization
Chasing All-Round Graph Representation Robustness: Model, Training, and Optimization
Personalized Federated Learning with Feature Alignment and Classifier Collaboration
Adversarial Attacks on Adversarial Bandits
Offline Reinforcement Learning via High-Fidelity Generative Behavior Modeling
Improving Object-centric Learning with Query Optimization
Phase transition for detecting a small community in a large network
Bi-level Physics-Informed Neural Networks for PDE Constrained Optimization using Broyden's Hypergradients
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection
Noise Is Not the Main Factor Behind the Gap Between SGD and Adam on Transformers, But Sign Descent Might Be
Transformer-Patcher: One Mistake Worth One Neuron
Are More Layers Beneficial to Graph Transformers?
Searching Lottery Tickets in Graph Neural Networks: A Dual Perspective
Improving Out-of-distribution Generalization with Indirection Representations
Bort: Towards Explainable Neural Networks with Bounded Orthogonal Constraint
The Power of Regularization in Solving Extensive-Form Games
S-NeRF: Neural Radiance Fields for Street Views
Cycle-consistent Masked AutoEncoder for Unsupervised Domain Generalization
CFlowNets: Continuous Control with Generative Flow Networks
Limitless Stability for Graph Convolutional Networks
DBQ-SSD: Dynamic Ball Query for Efficient 3D Object Detection
Differentiable Gaussianization Layers for Inverse Problems Regularized by Deep Generative Models
Exploring Low-Rank Property in Multiple Instance Learning for Whole Slide Image Classification
Breaking Correlation Shift via Conditional Invariant Regularizer
Towards One-shot Neural Combinatorial Solvers: Theoretical and Empirical Notes on the Cardinality-Constrained Case
In-context Reinforcement Learning with Algorithm Distillation
Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding
Divide to Adapt: Mitigating Confirmation Bias for Domain Adaptation of Black-Box Predictors
Block and Subword-Scaling Floating-Point (BSFP) : An Efficient Non-Uniform Quantization For Low Precision Inference
Semi-supervised Community Detection via Structural Similarity Metrics
Learning Symbolic Models for Graph-structured Physical Mechanism
DDM$^2$: Self-Supervised Diffusion MRI Denoising with Generative Diffusion Models
FiT: Parameter Efficient Few-shot Transfer Learning for Personalized and Federated Image Classification
Hard-Meta-Dataset++: Towards Understanding Few-Shot Performance on Difficult Tasks
Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased
Multivariate Time-series Imputation with Disentangled Temporal Representations
Evidential Uncertainty and Diversity Guided Active Learning for Scene Graph Generation
Automating Nearest Neighbor Search Configuration with Constrained Optimization
OTOv2: Automatic, Generic, User-Friendly
Unified Discrete Diffusion for Simultaneous Vision-Language Generation
Win: Weight-Decay-Integrated Nesterov Acceleration for Adaptive Gradient Algorithms
Self-Distillation for Further Pre-training of Transformers
Statistical Inference for Fisher Market Equilibrium
Visual Recognition with Deep Nearest Centroids
LPT: Long-tailed Prompt Tuning for Image Classification
DamoFD: Digging into Backbone Design on Face Detection
Prompting GPT-3 To Be Reliable
Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation
Spikformer: When Spiking Neural Network Meets Transformer
Multimodal Analogical Reasoning over Knowledge Graphs
Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction
Conditional Positional Encodings for Vision Transformers
Guarded Policy Optimization with Imperfect Online Demonstrations
Contrastive Learning Can Find An Optimal Basis For Approximately View-Invariant Functions
Revisiting the Entropy Semiring for Neural Speech Recognition
Rethinking skip connection model as a learnable Markov chain
Measuring axiomatic soundness of counterfactual image models
Alternating Differentiation for Optimization Layers
Out-of-distribution Detection with Implicit Outlier Transformation
Extracting Robust Models with Uncertain Examples
Stochastic Differentially Private and Fair Learning
Volumetric Optimal Transportation by Fast Fourier Transform
Hierarchical Relational Learning for Few-Shot Knowledge Graph Completion
The Devil is in the Wrongly-classified Samples: Towards Unified Open-set Recognition
MCAL: Minimum Cost Human-Machine Active Labeling
Learnable Topological Features For Phylogenetic Inference via Graph Neural Networks
Rotamer Density Estimator is an Unsupervised Learner of the Effect of Mutations on Protein-Protein Interaction
Bit-Pruning: A Sparse Multiplication-Less Dot-Product
A Stable and Scalable Method for Solving Initial Value PDEs with Neural Networks
Is Synthetic Data from Generative Models Ready for Image Recognition?
Learning Domain-Agnostic Representation for Disease Diagnosis
BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection
A Multi-Grained Self-Interpretable Symbolic-Neural Model For Single/Multi-Labeled Text Classification
Suppressing the Heterogeneity: A Strong Feature Extractor for Few-shot Segmentation
Achieve the Minimum Width of Neural Networks for Universal Approximation
UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph
On amortizing convex conjugates for optimal transport
Discovering Evolution Strategies via Meta-Black-Box Optimization
DualAfford: Learning Collaborative Visual Affordance for Dual-gripper Manipulation
SIMPLE: Specialized Model-Sample Matching for Domain Generalization
Contextual Image Masking Modeling via Synergized Contrasting without View Augmentation for Faster and Better Visual Pretraining
TDR-CL: Targeted Doubly Robust Collaborative Learning for Debiased Recommendations
Patch-Level Contrasting without Patch Correspondence for Accurate and Dense Contrastive Representation Learning
Continuous-Discrete Convolution for Geometry-Sequence Modeling in Proteins
Phenaki: Variable Length Video Generation from Open Domain Textual Descriptions
Human-level Atari 200x faster
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models
Molecule Generation For Target Protein Binding with Structural Motifs
Neural Collapse Inspired Feature-Classifier Alignment for Few-Shot Class-Incremental Learning
Towards Robust Object Detection Invariant to Real-World Domain Shifts
Generating Diverse Cooperative Agents by Learning Incompatible Policies
Information-Theoretic Analysis of Unsupervised Domain Adaptation
Effects of Graph Convolutions in Multi-layer Networks
Diffusion Posterior Sampling for General Noisy Inverse Problems
Ask Me Anything: A simple strategy for prompting language models
Multi-skill Mobile Manipulation for Object Rearrangement
Post-hoc Concept Bottleneck Models
Neural Image-based Avatars: Generalizable Radiance Fields for Human Avatar Modeling
Corrupted Image Modeling for Self-Supervised Visual Pre-Training
Deep Learning on Implicit Neural Representations of Shapes
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language
Clean-image Backdoor: Attacking Multi-label Models with Poisoned Labels Only
Neural DAG Scheduling via One-Shot Priority Sampling
Efficient recurrent architectures through activity sparsity and sparse back-propagation through time
Provably Auditing Ordinary Least Squares in Low Dimensions
On Accelerated Perceptrons and Beyond
EVA3D: Compositional 3D Human Generation from 2D Image Collections
Single-shot General Hyper-parameter Optimization for Federated Learning
DocPrompting: Generating Code by Retrieving the Docs
The Surprising Effectiveness of Equivariant Models in Domains with Latent Symmetry
Fooling SHAP with Stealthily Biased Sampling
View Synthesis with Sculpted Neural Points
On Pre-training Language Model for Antibody
Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
Efficient Attention via Control Variates
CUDA: Curriculum of Data Augmentation for Long-tailed Recognition
Spacetime Representation Learning
Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking
Mind's Eye: Grounded Language Model Reasoning through Simulation
Code Translation with Compiler Representations
Learnable Behavior Control: Breaking Atari Human World Records via Sample-Efficient Behavior Selection
Phase2vec: dynamical systems embedding with a physics-informed convolutional network
GPViT: A High Resolution Non-Hierarchical Vision Transformer with Group Propagation
Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow
Learning on Large-scale Text-attributed Graphs via Variational Inference
Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics
Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning
StableDR: Stabilized Doubly Robust Learning for Recommendation on Data Missing Not at Random
ACMP: Allen-Cahn Message Passing with Attractive and Repulsive Forces for Graph Neural Networks
Fundamental Limits in Formal Verification of Message-Passing Neural Networks
Generative Modelling with Inverse Heat Dissipation
Improving the imputation of missing data with Markov Blanket discovery
Classically Approximating Variational Quantum Machine Learning with Random Fourier Features
Evolve Smoothly, Fit Consistently: Learning Smooth Latent Dynamics For Advection-Dominated Systems
Unmasking the Lottery Ticket Hypothesis: What's Encoded in a Winning Ticket's Mask?
Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching
Is Reinforcement Learning (Not) for Natural Language Processing: Benchmarks, Baselines, and Building Blocks for Natural Language Policy Optimization
Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier
Recitation-Augmented Language Models
PowerQuant: Automorphism Search for Non-Uniform Quantization
Powderworld: A Platform for Understanding Generalization via Rich Task Distributions
Fisher-Legendre (FishLeg) optimization of deep neural networks
Pseudo-label Training and Model Inertia in Neural Machine Translation
Choreographer: Learning and Adapting Skills in Imagination
Blurring Diffusion Models
MultiViz: Towards Visualizing and Understanding Multimodal Models
LAVA: Data Valuation without Pre-Specified Learning Algorithms
Modeling the Data-Generating Process is Necessary for Out-of-Distribution Generalization
Neuromechanical Autoencoders: Learning to Couple Elastic and Neural Network Nonlinearity
NeRN: Learning Neural Representations for Neural Networks
Proposal-Contrastive Pretraining for Object Detection from Fewer Data
PAC Reinforcement Learning for Predictive State Representations
Unsupervised Manifold Alignment with Joint Multidimensional Scaling
Sampling is as easy as learning the score: theory for diffusion models with minimal data assumptions
Conditional Antibody Design as 3D Equivariant Graph Translation
From Play to Policy: Conditional Behavior Generation from Uncurated Robot Data
A CMDP-within-online framework for Meta-Safe Reinforcement Learning
Relational Attention: Generalizing Transformers for Graph-Structured Tasks
Making Better Decision by Directly Planning in Continuous Control
Training language models to summarize narratives improves brain alignment
Rarity Score : A New Metric to Evaluate the Uncommonness of Synthesized Images
Neuroevolution is a Competitive Alternative to Reinforcement Learning for Skill Discovery
MARS: Meta-learning as Score Matching in the Function Space
BALTO: fast tensor program optimization with diversity-based active learning
Pseudoinverse-Guided Diffusion Models for Inverse Problems
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers
HiViT: A Simpler and More Efficient Design of Hierarchical Vision Transformer
Moving Forward by Moving Backward: Embedding Action Impact over Action Semantics
Spatio-temporal point processes with deep non-stationary kernels
Modeling Sequential Sentence Relation to Improve Cross-lingual Dense Retrieval
Learning a Data-Driven Policy Network for Pre-Training Automated Feature Engineering
$O(T^{-1})$ Convergence of Optimistic-Follow-the-Regularized-Leader in Two-Player Zero-Sum Markov Games
Towards the Generalization of Contrastive Self-Supervised Learning
Deterministic training of generative autoencoders using invertible layers
Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries
Learning rigid dynamics with face interaction graph networks
Few-shot Cross-domain Image Generation via Inference-time Latent-code Learning
On the Sensitivity of Reward Inference to Misspecified Human Models
ArCL: Enhancing Contrastive Learning with Augmentation-Robust Representations
Pink Noise Is All You Need: Colored Noise Exploration in Deep Reinforcement Learning
Understanding and Adopting Rational Behavior by Bellman Score Estimation
SMART: Self-supervised Multi-task pretrAining with contRol Transformers
Learning with Stochastic Orders
LightGCL: Simple Yet Effective Graph Contrastive Learning for Recommendation
Extremely Simple Activation Shaping for Out-of-Distribution Detection
Extreme Q-Learning: MaxEnt RL without Entropy
Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs
Learning About Progress From Experts
Nonlinear Reconstruction for Operator Learning of PDEs with Discontinuities
Fast and Precise: Adjusting Planning Horizon with Adaptive Subgoal Search
Git Re-Basin: Merging Models modulo Permutation Symmetries
SimPer: Simple Self-Supervised Learning of Periodic Targets
No Reason for No Supervision: Improved Generalization in Supervised Models
Towards Effective and Interpretable Human-Agent Collaboration in MOBA Games: A Communication Perspective
EV-GAN: Simulation of extreme events with ReLU neural networks
Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Topologically penalized regression on manifolds
Principal Components Bias in Over-parameterized Linear Models, and its Manifestation in Deep Neural Networks
Composite Slice Transformer: An Efficient Transformer with Composition of Multi-Scale Multi-Range Attentions
OPTQ: Accurate Quantization for Generative Pre-trained Transformers
Faster Last-iterate Convergence of Policy Optimization in Zero-Sum Markov Games
Learning Controllable Adaptive Simulation for Multi-resolution Physics
The Tilted Variational Autoencoder: Improving Out-of-Distribution Detection
Constraining Representations Yields Models That Know What They Don't Know
Faster federated optimization under second-order similarity
Diffusion Models for Causal Discovery via Topological Ordering
Deconstructing Distributions: A Pointwise Framework of Learning
Direct Embedding of Temporal Network Edges via Time-Decayed Line Graphs
Gray-Box Gaussian Processes for Automated Reinforcement Learning
Machine Unlearning of Federated Clusters
Making Substitute Models More Bayesian Can Enhance Transferability of Adversarial Examples
MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations
Understanding Embodied Reference with Touch-Line Transformer
Semi-supervised learning with a principled likelihood from a generative model of data curation
The hidden uniform cluster prior in self-supervised learning
STaSy: Score-based Tabular data Synthesis
Near-optimal Coresets for Robust Clustering
Learning Diffusion Bridges on Constrained Domains
Bayesian Oracle for bounding information gain in neural encoding models
Scaling Pareto-Efficient Decision Making via Offline Multi-Objective RL
On Representing Linear Programs by Graph Neural Networks
SLTUNET: A Simple Unified Model for Sign Language Translation
Confidential-PROFITT: Confidential PROof of FaIr Training of Trees
A Convergent Single-Loop Algorithm for Relaxation of Gromov-Wasserstein in Graph Data
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners
Understanding Neural Coding on Latent Manifolds by Sharing Features and Dividing Ensembles
Model-based Causal Bayesian Optimization
Multi-objective optimization via equivariant deep hypervolume approximation
ExpressivE: A Spatio-Functional Embedding For Knowledge Graph Completion
Why adversarial training can hurt robust accuracy
Planning with Large Language Models for Code Generation
Where to Diffuse, How to Diffuse, and How to Get Back: Automated Learning for Multivariate Diffusions
EPISODE: Episodic Gradient Clipping with Periodic Resampled Corrections for Federated Learning with Heterogeneous Data
VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
How Does Semi-supervised Learning with Pseudo-labelers Work? A Case Study
Robust Scheduling with GFlowNets
Generating Sequences by Learning to Self-Correct
Feature selection and low test error in shallow low-rotation ReLU networks
A probabilistic framework for task-aligned intra- and inter-area neural manifold estimation
Behavior Proximal Policy Optimization
Bayes-MIL: A New Probabilistic Perspective on Attention-based Multiple Instance Learning for Whole Slide Images
Performance Bounds for Model and Policy Transfer in Hidden-parameter MDPs
Distributed Extra-gradient with Optimal Complexity and Communication Guarantees
Characteristic Neural Ordinary Differential Equation
Rhino: Deep Causal Temporal Relationship Learning with History-dependent Noise
Can We Find Nash Equilibria at a Linear Rate in Markov Games?
Causal Reasoning in the Presence of Latent Confounders via Neural ADMG Learning
Understanding DDPM Latent Codes Through Optimal Transport
Scaling up and Stabilizing Differentiable Planning with Implicit Differentiation
A Theory of Dynamic Benchmarks
Language models are multilingual chain-of-thought reasoners
FIGARO: Controllable Music Generation using Learned and Expert Features
Real-time variational method for learning neural trajectory and its dynamics
Label Propagation with Weak Supervision
MICN: Multi-scale Local and Global Context Modeling for Long-term Series Forecasting
f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation
Interpretable Geometric Deep Learning via Learnable Randomness Injection
Denoising Diffusion Samplers
Panning for Gold in Federated Learning: Targeted Text Extraction under Arbitrarily Large-Scale Aggregation
Model ensemble instead of prompt fusion: a sample-specific knowledge transfer method for few-shot prompt tuning
Learning Iterative Neural Optimizers for Image Steganography
Decomposed Prompting: A Modular Approach for Solving Complex Tasks
A Control-Centric Benchmark for Video Prediction
Enhancing Meta Learning via Multi-Objective Soft Improvement Functions
Pessimism in the Face of Confounders: Provably Efficient Offline Reinforcement Learning in Partially Observable Markov Decision Processes
Memorization Capacity of Neural Networks with Conditional Computation
Graph Domain Adaptation via Theory-Grounded Spectral Regularization
Characterizing intrinsic compositionality in transformers with Tree Projections
Stateful Active Facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning
Recursive Time Series Data Augmentation
Fast Nonlinear Vector Quantile Regression
Unveiling the sampling density in non-uniform geometric graphs
Test-Time Robust Personalization for Federated Learning
Solving stochastic weak Minty variational inequalities without increasing batch size
QAID: Question Answering Inspired Few-shot Intent Detection
Towards Addressing Label Skews in One-Shot Federated Learning
Contextual Convolutional Networks
Scenario-based Question Answering with Interacting Contextual Properties
EAGLE: Large-scale Learning of Turbulent Fluid Dynamics with Mesh Transformers
Neural ePDOs: Spatially Adaptive Equivariant Partial Differential Operator Based Networks
Learning Fair Graph Representations via Automated Data Augmentations
Learning Hierarchical Protein Representations via Complete 3D Graph Networks
Computing all Optimal Partial Transports
Learning Vortex Dynamics for Fluid Inference and Prediction
Analog Bits: Generating Discrete Data using Diffusion Models with Self-Conditioning
Filter-Recovery Network for Multi-Speaker Audio-Visual Speech Separation
Function-Consistent Feature Distillation
Decompose to Generalize: Species-Generalized Animal Pose Estimation
Scaling Up Probabilistic Circuits by Latent Variable Distillation
What shapes the loss landscape of self supervised learning?
Efficient approximation of neural population structure and correlations with probabilistic circuits
TypeT5: Seq2seq Type Inference using Static Analysis
Stay Moral and Explore: Learn to Behave Morally in Text-based Games
Improving Differentiable Neural Architecture Search by Encouraging Transferability
Auto-Encoding Goodness of Fit
Particle-based Variational Inference with Preconditioned Functional Gradient Flow
Markup-to-Image Diffusion Models with Scheduled Sampling
An Extensible Multi-modal Multi-task Object Dataset with Materials
DySR: Adaptive Super-Resolution via Algorithm and System Co-design
Can Neural Networks Learn Implicit Logic from Physical Reasoning?
ManyDG: Many-domain Generalization for Healthcare Applications
StyleMorph: Disentangled 3D-Aware Image Synthesis with a 3D Morphable StyleGAN
Packed Ensembles for efficient uncertainty estimation
Generalization and Estimation Error Bounds for Model-based Neural Networks
Towards convergence to Nash equilibria in two-team zero-sum games
Exploring Temporally Dynamic Data Augmentation for Video Recognition
Understanding Train-Validation Split in Meta-Learning with Neural Networks
Bispectral Neural Networks
A Learning Based Hypothesis Test for Harmful Covariate Shift
$k$NN Prompting: Beyond-Context Learning with Calibration-Free Nearest Neighbor Inference
Selective Frequency Network for Image Restoration
ImaginaryNet: Learning Object Detectors without Real Images and Annotations
Safe Reinforcement Learning From Pixels Using a Stochastic Latent Representation
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
FINDE: Neural Differential Equations for Finding and Preserving Invariant Quantities
Offline RL for Natural Language Generation with Implicit Language Q Learning
Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems
Human alignment of neural network representations
Risk-Aware Reinforcement Learning with Coherent Risk Measures and Non-linear Function Approximation
Self-supervision through Random Segments with Autoregressive Coding (RandSAC)
Efficiently Controlling Multiple Risks with Pareto Testing
DiffMimic: Efficient Motion Mimicking with Differentiable Physics
Causality Compensated Attention for Contextual Biased Visual Recognition
Sparse Mixture-of-Experts are Domain Generalizable Learners
Towards Better Selective Classification
Latent Bottlenecked Attentive Neural Processes
How Informative is the Approximation Error from Tensor Decomposition for Neural Network Compression?
Latent Graph Inference using Product Manifolds
Light Sampling Field and BRDF Representation for Physically-based Neural Rendering
FedExP: Speeding Up Federated Averaging via Extrapolation
Compositionality with Variation Reliably Emerges in Neural Networks
gDDIM: Generalized denoising diffusion implicit models
Fast Sampling of Diffusion Models with Exponential Integrator
Understanding The Robustness of Self-supervised Learning Through Topic Modeling
Interactive Portrait Harmonization
AIM: Adapting Image Models for Efficient Video Action Recognition
Parameter-Efficient Fine-Tuning Design Spaces
Not All Tasks Are Born Equal: Understanding Zero-Shot Generalization
Learning in temporally structured environments
Dilated convolution with learnable spacings
PEER: A Collaborative Language Model
Maximizing Spatio-Temporal Entropy of Deep 3D CNNs for Efficient Video Recognition
Formal Mathematics Statement Curriculum Learning
D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory
Subsampling in Large Graphs Using Ricci Curvature
$\Lambda$-DARTS: Mitigating Performance Collapse by Harmonizing Operation Selection among Cells
UL2: Unifying Language Learning Paradigms
STREET: A Multi-Task Structured Reasoning and Explanation Benchmark
Efficient Certified Training and Robustness Verification of Neural ODEs
Active Learning for Object Detection with Evidential Deep Learning and Hierarchical Uncertainty Aggregation
Diffusion Probabilistic Fields
Globally Injective ReLU Networks
SCALE-UP: An Efficient Black-box Input-level Backdoor Detection via Analyzing Scaled Prediction Consistency
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences
Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement
SemPPL: Predicting Pseudo-Labels for Better Contrastive Representations
Distributionally Robust Recourse Action
Bidirectional Propagation for Cross-Modal 3D Object Detection
On the Effectiveness of Out-of-Distribution Data in Self-Supervised Long-Tail Learning
Efficient Deep Reinforcement Learning Requires Regulating Overfitting
Using Both Demonstrations and Language Instructions to Efficiently Learn Robotic Tasks
Learning to Segment from Noisy Annotations: A Spatial Correction Approach
Diffusion Adversarial Representation Learning for Self-supervised Vessel Segmentation
DynaMS: Dynamic Margin Selection for Efficient Deep Learning
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers
Mid-Vision Feedback
HiCLIP: Contrastive Language-Image Pretraining with Hierarchy-aware Attention
Learning Rationalizable Equilibria in Multiplayer Games
Revisiting Intrinsic Reward for Exploration in Procedurally Generated Environments
Deep Learning From Crowdsourced Labels: Coupled Cross-Entropy Minimization, Identifiability, and Regularization
Efficient Federated Domain Translation
Diversify and Disambiguate: Out-of-Distribution Robustness via Disagreement
Agnostic Learning of General ReLU Activation Using Gradient Descent
Latent Variable Representation for Reinforcement Learning
Learning Probabilistic Topological Representations Using Discrete Morse Theory
Spectral Decomposition Representation for Reinforcement Learning
Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning
Robust Multivariate Time-Series Forecasting: Adversarial Attacks and Defense Mechanisms
MMVAE+: Enhancing the Generative Quality of Multimodal VAEs without Compromises
Causal Confusion and Reward Misidentification in Preference-Based Reward Learning
TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding
Open-Vocabulary Object Detection upon Frozen Vision and Language Models
Solving Continuous Control via Q-learning
Mutual Partial Label Learning with Competitive Label Noise
Partial Label Unsupervised Domain Adaptation with Class-Prototype Alignment
Max-Margin Works while Large Margin Fails: Generalization without Uniform Convergence
Implicit Regularization for Group Sparsity
A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning
Martingale Posterior Neural Processes
ESD: Expected Squared Difference as a Tuning-Free Trainable Calibration Measure
Red PANDA: Disambiguating Image Anomaly Detection by Removing Nuisance Factors
Identifiability Results for Multimodal Contrastive Learning
Weakly Supervised Knowledge Transfer with Probabilistic Logical Reasoning for Object Detection
Learning Group Importance using the Differentiable Hypergeometric Distribution
PAC-NeRF: Physics Augmented Continuum Neural Radiance Fields for Geometry-Agnostic System Identification
A View From Somewhere: Human-Centric Face Representations
PaLI: A Jointly-Scaled Multilingual Language-Image Model
Learning Uncertainty for Unknown Domains with Zero-Target-Assumption
Weakly Supervised Explainable Phrasal Reasoning with Neural Fuzzy Logic
Self-Supervised Geometric Correspondence for Category-Level 6D Object Pose Estimation in the Wild
Federated Learning as Variational Inference: A Scalable Expectation Propagation Approach
MPCFormer: Fast, Performant and Private Transformer Inference with MPC
Building Normalizing Flows with Stochastic Interpolants
Clifford Neural Layers for PDE Modeling
Learning to Compose Soft Prompts for Compositional Zero-Shot Learning
LogicDP: Creating Labels for Graph Data via Inductive Logic Programming
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam
Temporal Coherent Test Time Optimization for Robust Video Classification
Multi-Objective Online Learning
Provable Robustness against Wasserstein Distribution Shifts via Input Randomization
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos
On the Performance of Temporal Difference Learning With Neural Networks
Exploring and Exploiting Decision Boundary Dynamics for Adversarial Robustness
Scaling Laws for a Multi-Agent Reinforcement Learning Model
Transfer Learning with Deep Tabular Models
Anti-Symmetric DGN: a stable architecture for Deep Graph Networks
Protein Sequence and Structure Co-Design with Equivariant Translation
Differentially Private Adaptive Optimization with Delayed Preconditioners
SMART: Sentences as Basic Units for Text Evaluation
Fairness and Accuracy under Domain Generalization
Mitigating Dataset Bias by Using Per-Sample Gradient
Gradient Boosting Performs Gaussian Process Inference
A critical look at the evaluation of GNNs under heterophily: Are we really making progress?
Provably Efficient Lifelong Reinforcement Learning with Linear Representation
Confidence Estimation Using Unlabeled Data
Multi-Rate VAE: Train Once, Get the Full Rate-Distortion Curve
Diffusion-based Image Translation using disentangled style and content representation
Competitive Physics Informed Networks
Learnable Graph Convolutional Attention Networks
On the Data-Efficiency with Contrastive Image Transformation in Reinforcement Learning
Unbiased Supervised Contrastive Learning
Interneurons accelerate learning dynamics in recurrent neural networks for statistical adaptation
Bridging the Gap to Real-World Object-Centric Learning
Diffusion Policies as an Expressive Policy Class for Offline Reinforcement Learning
Understanding Zero-shot Adversarial Robustness for Large-Scale Models
TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis
3D generation on ImageNet
TTN: A Domain-Shift Aware Batch Normalization in Test-Time Adaptation
Winning Both the Accuracy of Floating Point Activation and the Simplicity of Integer Arithmetic
Sparse Distributed Memory is a Continual Learner
Deep Transformers without Shortcuts: Modifying Self-attention for Faithful Signal Propagation
Symmetric Pruning in Quantum Neural Networks
On The Inadequacy of Optimizing Alignment and Uniformity in Contrastive Learning of Sentence Representations
GNNDelete: A General Strategy for Unlearning in Graph Neural Networks
Fundamental limits on the robustness of image classifiers
CodeBPE: Investigating Subtokenization Options for Large Language Model Pretraining on Source Code
Leveraging Unlabeled Data to Track Memorization
Learning to reason over visual objects
Weighted Ensemble Self-Supervised Learning
Agent-based Graph Neural Networks
Tuning Frequency Bias in Neural Network Training with Nonuniform Data
Treeformer: Dense Gradient Trees for Efficient Attention Computation
The Lazy Neuron Phenomenon: On Emergence of Activation Sparsity in Transformers
Augmentation with Projection: Towards an Effective and Efficient Data Augmentation Paradigm for Distillation
Protein Representation Learning by Geometric Structure Pretraining
DAG Matters! GFlowNets Enhanced Explainer for Graph Neural Networks
Don’t forget the nullspace! Nullspace occupancy as a mechanism for out of distribution failure
Uniform-in-time propagation of chaos for the mean-field gradient Langevin dynamics
Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
Confidence-Based Feature Imputation for Graphs with Partially Known Features
Imitating Graph-Based Planning with Goal-Conditioned Policies
Long-Tailed Learning Requires Feature Learning
FedDAR: Federated Domain-Aware Representation Learning
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning
Factorized Fourier Neural Operators
Variational Latent Branching Model for Off-Policy Evaluation
3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction
Transformer-based model for symbolic regression via joint supervised learning
LS-IQ: Implicit Reward Regularization for Inverse Reinforcement Learning
Share Your Representation Only: Guaranteed Improvement of the Privacy-Utility Tradeoff in Federated Learning
Mini-batch $k$-means terminates within $O(d/\epsilon)$ iterations
Disentanglement of Correlated Factors via Hausdorff Factorized Support
An efficient encoder-decoder architecture with top-down attention for speech separation
Specformer: Spectral Graph Neural Networks Meet Transformers
BigVGAN: A Universal Neural Vocoder with Large-Scale Training
ZiCo: Zero-shot NAS via inverse Coefficient of Variation on Gradients
Cross-Layer Retrospective Retrieving via Layer Attention
Decision S4: Efficient Sequence-Based RL via State Spaces Layers
Easy Differentially Private Linear Regression
Contextual bandits with concave rewards, and an application to fair ranking
Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study
A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning
Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules
DropIT: Dropping Intermediate Tensors for Memory-Efficient DNN Training
Latent State Marginalization as a Low-cost Approach for Improving Exploration
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data
GFlowNets and variational inference
Leveraging Large Language Models for Multiple Choice Question Answering
Understanding the Covariance Structure of Convolutional Filters
Regression with Label Differential Privacy
E3Bind: An End-to-End Equivariant Network for Protein-Ligand Docking
WiNeRT: Towards Neural Ray Tracing for Wireless Channel Modelling and Differentiable Simulations
MA-BERT: Towards Matrix Arithmetic-only BERT Inference by Eliminating Complex Non-Linear Functions
An Exact Poly-Time Membership-Queries Algorithm for Extracting a Three-Layer ReLU Network
On the Robustness to Misspecification of α-posteriors and Their Variational Approximations
Binding Language Models in Symbolic Languages
ContraNorm: A Contrastive Learning Perspective on Oversmoothing and Beyond
Compositional Semantic Parsing with Large Language Models
Coupled Multiwavelet Operator Learning for Coupled Differential Equations
POPGym: Benchmarking Partially Observable Reinforcement Learning
Rethinking the Effect of Data Augmentation in Adversarial Contrastive Learning
TrojText: Test-time Invisible Textual Trojan Insertion
Transferable Unlearnable Examples
Error Sensitivity Modulation based Experience Replay: Mitigating Abrupt Representation Drift in Continual Learning
Modelling Long Range Dependencies in $N$D: From Task-Specific to a General Purpose CNN
Task-Aware Information Routing from Common Representation Space in Lifelong Learning
GNNInterpreter: A Probabilistic Generative Model-Level Explanation for Graph Neural Networks
Pushing the Limits of Fewshot Anomaly Detection in Industry Vision: Graphcore
Combinatorial Pure Exploration of Causal Bandits
Pareto Invariant Risk Minimization: Towards Mitigating the Optimization Dilemma in Out-of-Distribution Generalization
What Makes Convolutional Models Great on Long Sequence Modeling?
Learning Multimodal Data Augmentation in Feature Space
Neural Systematic Binder
Active Image Indexing
How Much Data Are Augmentations Worth? An Investigation into Scaling Laws, Invariance, and Implicit Regularization
Learning Human-Compatible Representations for Case-Based Decision Support
Optimizing Spca-based Continual Learning: A Theoretical Approach
Decepticons: Corrupted Transformers Breach Privacy in Federated Learning for Language Models
The Best of Both Worlds: Accurate Global and Personalized Models through Federated Learning with Data-Free Hyper-Knowledge Distillation
Linear Connectivity Reveals Generalization Strategies
SCoMoE: Efficient Mixtures of Experts with Structured Communication
A view of mini-batch SGD via generating functions: conditions of convergence, phase transitions, benefit from negative momenta
The Role of Coverage in Online Reinforcement Learning
PINTO: Faithful Language Reasoning Using Prompt-Generated Rationales
GEASS: Neural causal feature selection for high-dimensional biological data
SmartFRZ: An Efficient Training Framework using Attention-Based Layer Freezing
Hierarchical Abstraction for Combinatorial Generalization in Object Rearrangement
Augmentation Component Analysis: Modeling Similarity via the Augmentation Overlaps
Neural Bregman Divergences for Distance Learning
Incompatibility Clustering as a Defense Against Backdoor Poisoning Attacks
A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity
ChiroDiff: Modelling chirographic data with Diffusion Models
Out-of-Distribution Detection and Selective Generation for Conditional Language Models
A Unified Framework for Soft Threshold Pruning
Federated Neural Bandits
Compositional Task Representations for Large Language Models
An Additive Instance-Wise Approach to Multi-class Model Interpretation
Editing models with task arithmetic
Reparameterization through Spatial Gradient Scaling
Quality-Similar Diversity via Population Based Reinforcement Learning
Indiscriminate Poisoning Attacks on Unsupervised Contrastive Learning
Adaptive Optimization in the $\infty$-Width Limit
Surgical Fine-Tuning Improves Adaptation to Distribution Shifts
Avoiding spurious correlations via logit correction
Interpretability with full complexity by constraining feature information
User-Interactive Offline Reinforcement Learning
Safe Exploration Incurs Nearly No Additional Sample Complexity for Reward-Free RL
Large Language Models are Human-Level Prompt Engineers
Pruning Deep Neural Networks from a Sparsity Perspective
Spotlight: Mobile UI Understanding using Vision-Language Models with a Focus
Dual Student Networks for Data-Free Model Stealing
Data-Free One-Shot Federated Learning Under Very High Statistical Heterogeneity
Mosaic Representation Learning for Self-supervised Visual Pre-training
A Theoretical Framework for Inference and Learning in Predictive Coding Networks
Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top
Energy-Inspired Self-Supervised Pretraining for Vision Models
Effectively Modeling Time Series with Simple Discrete State Spaces
A Time Series is Worth 64 Words: Long-term Forecasting with Transformers
Random Laplacian Features for Learning with Hyperbolic Space
Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay
$\mathrm{SE}(3)$-Equivariant Attention Networks for Shape Reconstruction in Function Space
Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought
Momentum Stiefel Optimizer, with Applications to Suitably-Orthogonal Attention, and Optimal Transport
Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection
CircNet: Meshing 3D Point Clouds with Circumcenter Detection
Robust Graph Dictionary Learning
Unsupervised visualization of image datasets using contrastive learning
Learning Harmonic Molecular Representations on Riemannian Manifold
Learning Language Representations with Logical Inductive Bias
First Steps Toward Understanding the Extrapolation of Nonlinear Models to Unseen Domains
Concept-level Debugging of Part-Prototype Networks
CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
Computational Language Acquisition with Theory of Mind
Projective Proximal Gradient Descent for Nonconvex Nonsmooth Optimization: Fast Convergence Without Kurdyka-Lojasiewicz (KL) Property
Mega: Moving Average Equipped Gated Attention
Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees
Prototypical Calibration for Few-shot Learning of Language Models
Serving Graph Compression for Graph Neural Networks
Learning MLPs on Graphs: A Unified View of Effectiveness, Robustness, and Efficiency
Geometrically regularized autoencoders for non-Euclidean data
Calibrating the Rigged Lottery: Making All Tickets Reliable
A new characterization of the edge of stability based on a sharpness measure aware of batch gradient distribution
VoGE: A Differentiable Volume Renderer using Gaussian Ellipsoids for Analysis-by-Synthesis