Despite encouraging progress in embodied learning over the past two decades, a large gap remains between the perception of embodied agents and that of humans. Humans have a remarkable ability to combine all of their multisensory inputs. To close this gap, embodied agents should likewise be able to see, hear, touch, and interact with their surroundings in order to select appropriate actions. However, today's learning algorithms primarily operate on a single modality. For artificial intelligence to make progress in understanding the world around us, it must learn to interpret such multimodal signals jointly. The goal of this workshop is to share recent progress and discuss current challenges in embodied learning with multiple modalities.
The EML workshop will bring together researchers from different subareas of embodied multimodal learning, including computer vision, robotics, machine learning, natural language processing, and cognitive science, to examine the challenges and opportunities that emerge from designing embodied agents that unify their multisensory inputs. We will review the current state of the field and identify the research infrastructure needed to enable stronger collaboration between researchers working on different modalities.
Fri 7:55 a.m. - 8:00 a.m. | Introduction and Opening Remarks (Opening Remarks)
Fri 8:00 a.m. - 8:01 a.m. | Invited Talk 1 Speaker Introduction (Live Introduction)
Fri 8:01 a.m. - 8:27 a.m. | Invited Talk 1 (Invited Talk) | Katherine J. Kuchenbecker
Fri 8:27 a.m. - 8:30 a.m. | Invited Talk 1 Q&A (Live Q&A)
Fri 8:30 a.m. - 8:31 a.m. | Invited Talk 2 Speaker Introduction (Live Introduction)
Fri 8:31 a.m. - 8:57 a.m. | Invited Talk 2 (Invited Talk) | Danica Kragic
Fri 8:57 a.m. - 9:00 a.m. | Invited Talk 2 Q&A (Live Q&A)
Fri 9:00 a.m. - 9:06 a.m. | Paper Session 1 - Paper 1 (Paper Talks): ABC Problem: An Investigation of Offline RL for Vision-Based Dynamic Manipulation | Kamyar Ghassemipour
Fri 9:06 a.m. - 9:12 a.m. | Paper Session 1 - Paper 2 (Paper Talks): Language Acquisition is Embodied, Interactive, and Emotive | Casey Kennington
Fri 9:12 a.m. - 9:18 a.m. | Paper Session 1 - Paper 3 (Paper Talks): Ask & Explore: Grounded Question Answering for Curiosity-Driven Exploration | Paul Pu Liang
Fri 9:18 a.m. - 9:24 a.m. | Paper Session 1 - Paper 4 (Paper Talks): Towards Teaching Machines with Language: Interactive Learning from Only Language Descriptions of Activities | Khanh Nguyen
Fri 9:24 a.m. - 9:30 a.m. | Paper Session 1 - Paper 5 (Paper Talks): YouRefIt: Embodied Reference Understanding with Language and Gesture | Yixin Chen
Fri 9:30 a.m. - 9:40 a.m. | Paper Session 1 Q&A (Paper Talks Live Q&A)
Fri 9:40 a.m. - 10:00 a.m. | Break 1
Fri 10:00 a.m. - 10:01 a.m. | Invited Talk 3 Speaker Introduction (Live Introduction)
Fri 10:01 a.m. - 10:27 a.m. | Invited Talk 3 (Invited Talk) | Linda Smith
Fri 10:27 a.m. - 10:30 a.m. | Invited Talk 3 Q&A (Live Q&A)
Fri 10:30 a.m. - 10:31 a.m. | Invited Talk 4 Speaker Introduction (Live Introduction)
Fri 10:31 a.m. - 10:57 a.m. | Invited Talk 4 (Invited Talk) | Felix Hill
Fri 10:57 a.m. - 11:00 a.m. | Invited Talk 4 Q&A (Live Q&A)
Fri 11:00 a.m. - 12:00 p.m. | Panel Discussion
Fri 12:00 p.m. - 12:30 p.m. | Break 2
Fri 12:30 p.m. - 12:31 p.m. | Invited Talk 5 Speaker Introduction (Live Introduction)
Fri 12:31 p.m. - 12:57 p.m. | Invited Talk 5 (Invited Talk) | Abhinav Gupta
Fri 12:57 p.m. - 1:00 p.m. | Invited Talk 5 Q&A (Live Q&A)
Fri 1:00 p.m. - 1:01 p.m. | Invited Talk 6 Speaker Introduction (Live Introduction)
Fri 1:01 p.m. - 1:27 p.m. | Invited Talk 6 (Invited Talk) | Sergey Levine
Fri 1:27 p.m. - 1:30 p.m. | Invited Talk 6 Q&A (Live Q&A)
Fri 1:30 p.m. - 1:36 p.m. | Paper Session 2 - Paper 6 (Paper Talks): Learning to Set Waypoints for Audio-Visual Navigation | Changan Chen
Fri 1:36 p.m. - 1:42 p.m. | Paper Session 2 - Paper 7 (Paper Talks): Semantic Audio-Visual Navigation | Changan Chen
Fri 1:42 p.m. - 1:48 p.m. | Paper Session 2 - Paper 8 (Paper Talks): Attentive Feature Reuse for Multi-Task Meta-learning | Kiran Lekkala
Fri 1:48 p.m. - 1:54 p.m. | Paper Session 2 - Paper 9 (Paper Talks): SeLaVi: self-labelling videos without any annotations from scratch | Yuki Asano
Fri 2:00 p.m. - 2:10 p.m. | Paper Session 2 Q&A (Paper Talks Live Q&A)
Fri 2:10 p.m. - 2:30 p.m. | Break 3
Fri 2:30 p.m. - 2:31 p.m. | Invited Talk 7 Speaker Introduction (Live Introduction)
Fri 2:31 p.m. - 2:57 p.m. | Invited Talk 7 (Invited Talk) | Jitendra Malik
Fri 2:57 p.m. - 3:00 p.m. | Invited Talk 7 Q&A (Live Q&A)
Fri 3:00 p.m. - 3:01 p.m. | Invited Talk 8 Speaker Introduction (Live Introduction)
Fri 3:01 p.m. - 3:27 p.m. | Invited Talk 8 (Invited Talk) | Claudia D'Arpino
Fri 3:27 p.m. - 3:30 p.m. | Invited Talk 8 Q&A (Live Q&A)
Fri 3:30 p.m. - 3:35 p.m. | Closing Remarks