

Oral
Thu Apr 23 11:15 AM -- 11:25 AM (PDT) @ 202 A/B
Reasoning as Representation: Rethinking Visual Reinforcement Learning in Image Quality Assessment
Shijie Zhao ⋅ Xuanyu Zhang ⋅ Weiqi Li ⋅ Junlin Li ⋅ Li Zhang ⋅ Tianfan Xue ⋅ Jian Zhang
Oral
Thu Apr 23 11:27 AM -- 11:37 AM (PDT) @ 202 A/B
Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning
Hao Tan ⋅ Jun Lan ⋅ Zichang Tan ⋅ Senyuan Shi ⋅ Ajian Liu ⋅ Chuanbiao Song ⋅ Huijia Zhu ⋅ Weiqiang Wang ⋅ Jun Wan ⋅ Zhen Lei
Oral
Thu Apr 23 11:39 AM -- 11:49 AM (PDT) @ 202 A/B
On the Generalization Capacities of MLLMs for Spatial Intelligence
Gongjie Zhang ⋅ Wenhao Li ⋅ Quanhao Qian ⋅ Jiuniu Wang ⋅ Deli Zhao ⋅ Shijian Lu ⋅ Ran Xu
Oral
Thu Apr 23 11:51 AM -- 12:01 PM (PDT) @ 202 A/B
DepthLM: Metric Depth from Vision Language Models
Zhipeng Cai ⋅ Ching-Feng Yeh ⋅ Hu Xu ⋅ Zhuang Liu ⋅ Gregory P. Meyer ⋅ Xinjie Lei ⋅ Changsheng Zhao ⋅ Shang-Wen Li ⋅ Vikas Chandra ⋅ Yangyang Shi
Oral
Thu Apr 23 12:03 PM -- 12:13 PM (PDT) @ 202 A/B
FlashVID: Efficient Video Large Language Models via Training-free Tree-based Spatiotemporal Token Merging
Ziyang Fan ⋅ Keyu Chen ⋅ Ruilong Xing ⋅ Yulin Li ⋅ Li Jiang ⋅ Zhuotao Tian
Oral
Thu Apr 23 12:15 PM -- 12:25 PM (PDT) @ 202 A/B
Multimodal Aligned Semantic Knowledge for Unpaired Image-text Matching
Laiguo Yin ⋅ Yixin Zhang ⋅ Yuqing Sun ⋅ Lizhen Cui
Oral
Thu Apr 23 12:27 PM -- 12:37 PM (PDT) @ 202 A/B
Vid-LLM: A Compact Video-based 3D Multimodal LLM with Reconstruction–Reasoning Synergy
Haijier Chen ⋅ Bo Xu ⋅ Shoujian Zhang ⋅ Haoze Liu ⋅ Jiaxuan Lin ⋅ Jingrong Wang