(7 events)
Oral
Fri Apr 24 11:15 AM -- 11:25 AM (PDT) @ 202 A/B
Learning to See Before Seeing: Demystifying LLM Visual Priors from Language Pre-training
Junlin Han ⋅ Shengbang Tong ⋅ David Fan ⋅ Yufan Ren ⋅ Koustuv Sinha ⋅ Philip Torr ⋅ Filippos Kokkinos
[OpenReview]
Oral
Fri Apr 24 11:27 AM -- 11:37 AM (PDT) @ 202 A/B
Hallucination Begins Where Saliency Drops
Xiaofeng Zhang ⋅ Yuanchao Zhu ⋅ Chaochen Gu ⋅ Xiaosong Yuan ⋅ Qiyan Zhao ⋅ Jiawei Cao ⋅ Barrett Tang ⋅ Sinan Fan ⋅ Yaomin Shen ⋅ Chen Shen ⋅ Hao Tang
[Slides] [OpenReview]
Oral
Fri Apr 24 11:39 AM -- 11:49 AM (PDT) @ 202 A/B
Through the Lens of Contrast: Self-Improving Visual Reasoning in VLMs
Zhiyu Pan ⋅ Yizheng Wu ⋅ Jiashen Hua ⋅ Junyi Feng ⋅ Shaotian Yan ⋅ Bing Deng ⋅ Zhiguo Cao ⋅ Jieping Ye
[OpenReview]
Oral
Fri Apr 24 11:51 AM -- 12:01 PM (PDT) @ 202 A/B
MetaEmbed: Scaling Multimodal Retrieval at Test-Time with Flexible Late Interaction
Zilin Xiao ⋅ Qi Ma ⋅ Mengting Gu ⋅ Chun-cheng Chen ⋅ Xintao Chen ⋅ Vicente Ordonez ⋅ Vijai Mohan
[OpenReview]
Oral
Fri Apr 24 12:03 PM -- 12:13 PM (PDT) @ 202 A/B
Seeing Through the Brain: New Insights from Decoding Visual Stimuli with fMRI
Zheng Huang ⋅ Enpei Zhang ⋅ Weikang Qiu ⋅ Yinghao Cai ⋅ Carl Yang ⋅ Elynn Chen ⋅ Xiang Zhang ⋅ Rex Ying ⋅ Dawei Zhou ⋅ Yujun Yan
[OpenReview]
Oral
Fri Apr 24 12:15 PM -- 12:25 PM (PDT) @ 202 A/B
WAVE: Learning Unified & Versatile Audio-Visual Embeddings with Multimodal LLM
Changli Tang ⋅ Qinfan Xiao ⋅ Ke Mei ⋅ Tianyi Wang ⋅ Fengyun Rao ⋅ Chao Zhang
[OpenReview]
Oral
Fri Apr 24 12:27 PM -- 12:37 PM (PDT) @ 202 A/B
Visual symbolic mechanisms: Emergent symbol processing in Vision Language Models
Rim Assouel ⋅ Declan Campbell ⋅ Yoshua Bengio ⋅ Taylor Webb
[OpenReview]