4127 Results
Poster
|
Fri 19:00 |
Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision Orr Zohar · Xiaohan Wang · Yonatan Bitton · Idan Szpektor · Serena Yeung |
|
Poster
|
Wed 19:00 |
VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks Lawrence Jang · Yinheng Li · Dan Zhao · Charles Ding · Justin Lin · Paul Pu Liang · Rogerio Bonatti · Kazuhito Koishida |
|
Poster
|
Thu 19:00 |
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation Tianchen Zhao · Tongcheng Fang · Haofeng Huang · Rui Wan · Widyadewi Soedarmadji · Enshu Liu · Shiyao Li · Zinan Lin · Guohao Dai · Shengen Yan · Huazhong Yang · Xuefei Ning · Yu Wang |
|
Poster
|
Wed 19:00 |
ViSAGe: Video-to-Spatial Audio Generation Jaeyeon Kim · Heeseung Yun · Gunhee Kim |
|
Poster
|
Sat 0:00 |
Vision CNNs trained to estimate spatial latents learned similar ventral-stream-aligned representations Yudi Xie · Weichen Huang · Esther Alter · Jeremy Schwartz · Joshua B Tenenbaum · James DiCarlo |
|
Poster
|
Fri 0:00 |
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents Xiao Liu · Tianjie Zhang · Yu Gu · Iat Long Iong · Song XiXuan · Yifan Xu · Shudan Zhang · Hanyu Lai · Jiadai Sun · Xinyue Yang · Yu Yang · Zehan Qi · Shuntian Yao · Xueqiao Sun · Siyi Cheng · Qinkai Zheng · Hao Yu · Hanchen Zhang · Wenyi Hong · Ming Ding · Lihang Pan · Xiaotao Gu · Aohan Zeng · Zhengxiao Du · Chan Hee Song · Yu Su · Yuxiao Dong · Jie Tang |
|
Poster
|
Thu 19:00 |
Visual Agents as Fast and Slow Thinkers Guangyan Sun · Mingyu Jin · Zhenting Wang · Chenglong Wang · Siqi Ma · Qifan Wang · Tong Geng · Yingnian Wu · Yongfeng Zhang · Dongfang Liu |
|
Poster
|
Fri 19:00 |
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark Tsung-Han Wu · Giscard Biamby · Jerome Quenum · Ritwik Gupta · Joseph E Gonzalez · Trevor Darrell · David Chan |
|
Poster
|
Wed 19:00 |
Visually Consistent Hierarchical Image Classification Seulki Park · Youren Zhang · Stella Yu · Sara Beery · Jonathan Huang |
|
Poster
|
Fri 0:00 |
Visually Guided Decoding: Gradient-Free Hard Prompt Inversion with Language Models Donghoon Kim · Minji Bae · Kyuhong Shim · Byonghyo Shim |
|
Poster
|
Sat 0:00 |
Visual-O1: Understanding Ambiguous Instructions via Multi-modal Multi-turn Chain-of-thoughts Reasoning Minheng Ni · YuTao Fan · Lei Zhang · Wangmeng Zuo |
|
Poster
|
Wed 19:00 |
VLMaterial: Procedural Material Generation with Large Vision-Language Models Beichen Li · Rundi Wu · Armando Solar-Lezama · Changxi Zheng · Liang Shi · Bernd Bickel · Wojciech Matusik |