

Poster

VCR: A Task for Pixel-Level Complex Reasoning in Vision Language Models via Restoring Occluded Text

Tianyu Zhang ⋅ Suyuchen Wang ⋅ Lu Li ⋅ Ge Zhang ⋅ Perouz Taslakian ⋅ Sai Rajeswar ⋅ Jie Fu ⋅ Bang Liu ⋅ Yoshua Bengio
2025 Poster

Abstract

