ImageNet-Think-250K: A Large-Scale Synthetic Dataset for Multimodal Reasoning for Vision Language Models

Krishna Teja Chitty-Venkata ⋅ Murali Emani

Abstract
