Registration and Check-in are located in the lobby of the convention center near the Radisson entrance.
Dialogue Research in the Era of LLMs
Recent large language models (LLMs) have enabled significant advances in open-domain dialogue systems due to their ability to generate coherent natural language responses to any user request. Their capacity to memorize and perform compositional reasoning has enabled accurate execution of dialogue-related tasks, such as language understanding and response generation. However, these models suffer from limitations such as hallucination, undesired capture of biases, difficulty generalizing to specific policies, and lack of interpretability. To tackle these issues, the natural language processing community has proposed methods such as injecting knowledge into language models during training or inference, and retrieving related knowledge using multi-step inference and APIs/tools. In this talk, I plan to provide an overview of our work and that of others aiming to address these challenges.
Town Hall: LLMs in the ICLR Writing Process?
Topic: How much LLM use should be allowed in academic paper writing? Pros and cons?
Open discussion led by Sasha Rush (ICLR board) and the ICLR 2023 Organizing Committee.
Learned optimizers: why they're the future, why they're hard, and what they can do now
The success of deep learning has hinged on learned functions dramatically outperforming hand-designed functions for many tasks. However, we still train models using hand-designed optimizers acting on hand-designed loss functions. I will argue that these hand-designed components are typically mismatched to the desired behavior, and that we can expect meta-learned optimizers to perform much better. I will discuss the challenges and pathologies that make meta-training learned optimizers difficult. These include: chaotic and high-variance meta-loss landscapes; extreme computational costs for meta-training; lack of comprehensive meta-training datasets; challenges designing learned optimizers with the right inductive biases; and challenges interpreting the method of action of learned optimizers. I will share solutions to some of these challenges. I will show experimental results where learned optimizers outperform hand-designed optimizers in many contexts, and I will discuss novel capabilities that are enabled by meta-training learned optimizers.