AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding

Ahmed Masry ⋅ Juan A. Rodriguez ⋅ Tianyu Zhang ⋅ Suyuchen Wang ⋅ Chao Wang ⋅ Aarash Feizi ⋅ Akshay Suresh ⋅ Abhay Puri ⋅ Xiangru Jian ⋅ Pierre-André Noël ⋅ Sathwik Tejaswi Madhusudhan ⋅ Marco Pedersoli ⋅ Bang Liu ⋅ Nicolas Chapados ⋅ Yoshua Bengio ⋅ Enamul Hoque ⋅ Christopher Pal ⋅ Issam Laradji ⋅ David Vazquez ⋅ Perouz Taslakian ⋅ Spandana Gella ⋅ Sai Rajeswar
