220 Results
Poster
|
Learning to Decompose Visual Features with Latent Textual Prompts Feng Wang · Manling Li · Xudong Lin · Hairong Lv · Alex Schwing · Heng Ji |
||
Poster
|
Universal Vision-Language Dense Retrieval: Learning A Unified Representation Space for Multi-Modal Retrieval Zhenghao Liu · Chenyan Xiong · Yuanhuiyi Lv · Zhiyuan Liu · Ge Yu |
||
Poster
|
PaLI: A Jointly-Scaled Multilingual Language-Image Model Xi Chen · Xiao Wang · Soravit Changpinyo · AJ Piergiovanni · Piotr Padlewski · Daniel Salz · Sebastian Goodman · Adam Grycner · Basil Mustafa · Lucas Beyer · Alexander Kolesnikov · Joan Puigcerver · Nan Ding · Keran Rong · Hassan Akbari · Gaurav Mishra · Linting Xue · Ashish V. Thapliyal · James Bradbury · Weicheng Kuo · Mojtaba Seyedhosseini · Chao Jia · Burcu Karagol Ayan · Carlos Riquelme · Andreas Steiner · Anelia Angelova · Xiaohua Zhai · Neil Houlsby · Radu Soricut |
||
Poster
|
Wed 7:30 |
Write and Paint: Generative Vision-Language Models are Unified Modal Learners Shizhe Diao · Wangchunshu Zhou · Xinsong Zhang · Jiawei Wang |
|
Poster
|
Wed 7:30 |
CLIPSep: Learning Text-queried Sound Separation with Noisy Unlabeled Videos Hao-Wen Dong · Naoya Takahashi · Yuki Mitsufuji · Julian McAuley · Taylor Berg-Kirkpatrick |
|
Workshop
|
Thu 4:00 |
Coordinating Multiple Vision-Language Models for Visual Reasoning Liangyu Chen · Bo Li · Sheng Shen · Jingkang Yang · Chunyuan Li · Kurt Keutzer · Trevor Darrell · Ziwei Liu |
|
Workshop
|
Thu 4:00 |
Variational prompt tuning improves generalization of vision-language foundation models Mohammad Mahdi Derakhshani · Enrique Sanchez · Adrian Bulat · Victor Guilherme Turrisi da Costa · Cees G Snoek · Georgios Tzimiropoulos · Brais Martinez |
|
Poster
|
ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency Pengzhen Ren · Changlin Li · Hang Xu · Yi Zhu · Guangrun Wang · Jianzhuang Liu · Xiaojun Chang · Xiaodan Liang |
||
Oral
|
Wed 1:50 |
Visual Classification via Description from Large Language Models Sachit Menon · Carl Vondrick |
|
Poster
|
Tue 2:30 |
Programmatically Grounded, Compositionally Generalizable Robotic Manipulation Renhao Wang · Jiayuan Mao · Joy Hsu · Hang Zhao · Jiajun Wu · Yang Gao |
|
Workshop
|
Enabling Calibration In The Zero-Shot Inference of Large Vision-Language Models Will LeVine · Benjamin Pikus · Pranav Raja · Fernando Amat |
||
Poster
|
Wed 2:30 |
Visual Classification via Description from Large Language Models Sachit Menon · Carl Vondrick |