ICLR 2023


Multimodal Representation Learning (MRL): Perks and Pitfalls

Adrián Javaloy · Miguel Vasco · Imant Daunhawer · Petra Poklukar · Yuge Shi · Danica Kragic · Isabel Valera


Following the rise of deep learning, multimodal machine learning has made steady progress, becoming ubiquitous in many domains. Learning representations from multiple modalities can be beneficial, since different perceptual modalities can inform each other and ground abstract phenomena in a more robust, generalisable way. However, the complexity of different modalities can hinder the training process, requiring careful model design in order to learn meaningful representations. In light of these seemingly conflicting aspects of multimodal learning, we must improve our understanding of what makes each modality different, how modalities interact, and what the desiderata of multimodal representations are.

With this workshop, we aim to bring the multimodal community together, promoting work on multimodal representation learning that provides systematic insights into the nature of the learned representations, as well as ways to improve and understand the training of multimodal models, from both a theoretical and an empirical point of view.

In particular, we focus on the following questions:

(Representation) How do we identify useful properties of multimodal representations?
(Training) How can we promote useful properties of multimodal representations?
(Modalities) What makes a modality different? How can we improve the interactions between modalities?

The MRL workshop aims to bring together experts from the multimodal learning community to advance these fundamental questions and discuss the future of the field. We invite submissions that present analyses of the properties of multimodal representations, insights into interactions across modalities, and novel applications regarding the nature and number of modalities employed.
