Search All 2021 Events

39 Results

Page 1 of 4
Oral
Wed 3:30 Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation
Biao Zhang · Ankur Bapna · Rico Sennrich · Orhan Firat
Poster
Tue 9:00 Transformer protein language models are unsupervised structure learners
Roshan Rao · Joshua Meier · Tom Sercu · Sergey Ovchinnikov · Alexander Rives
Oral
Wed 3:00 An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy · Lucas Beyer · Alexander Kolesnikov · Dirk Weissenborn · Xiaohua Zhai · Thomas Unterthiner · Mostafa Dehghani · Matthias Minderer · Georg Heigold · Sylvain Gelly · Jakob Uszkoreit · Neil Houlsby
Poster
Tue 17:00 Hopper: Multi-hop Transformer for Spatiotemporal Reasoning
Honglu Zhou · Asim Kadav · Farley Lai · Alexandru Niculescu-Mizil · Martin Min · Mubbasir Kapadia · Hans P Graf
Poster
Mon 1:00 Predicting Infectiousness for Proactive Contact Tracing
Yoshua Bengio · Prateek Gupta · Tegan Maharaj · Nasim Rahaman · Martin Weiss · Tristan Deleu · Eilif B Muller · Meng Qu · Victor Schmidt · Pierre-Luc St-Charles · Hannah Alsdurf · Olexa Bilaniuk · David Buckeridge · Gaétan Marceau Caron · Pierre Carrier · Joumana Ghosn · Satya Gagne · Chris J Pal · Irina Rish · Bernhard Schoelkopf · Abhinav Sharma · J
Workshop
Fri 10:05 Transformer Language Models as Universal Computation Engines
Kevin Lu
Poster
Wed 1:00 An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy · Lucas Beyer · Alexander Kolesnikov · Dirk Weissenborn · Xiaohua Zhai · Thomas Unterthiner · Mostafa Dehghani · Matthias Minderer · Georg Heigold · Sylvain Gelly · Jakob Uszkoreit · Neil Houlsby
Poster
Mon 9:00 LambdaNetworks: Modeling long-range Interactions without Attention
Irwan Bello
Poster
Tue 1:00 A Trainable Optimal Transport Embedding for Feature Aggregation and its Relationship to Attention
Grégoire Mialon · Dexiong Chen · Alexandre d'Aspremont · Julien Mairal
Poster
Thu 9:00 Efficient Transformers in Reinforcement Learning using Actor-Learner Distillation
Emilio Parisotto · Ruslan Salakhutdinov
Spotlight
Wed 12:48 LambdaNetworks: Modeling long-range Interactions without Attention
Irwan Bello
Poster
Wed 9:00 HyperGrid Transformers: Towards A Single Model for Multiple Tasks
Yi Tay · Zhe Zhao · Dara Bahri · Donald Metzler · Da-Cheng Juan