Poster in Workshop: Navigating and Addressing Data Problems for Foundation Models (DPFM)
Multimodal Dataset Upgrading: a New Challenge for Data Annotation
Haiwen Huang · Dan Zhang · Andreas Geiger
Keywords: [ data upgrading ] [ data annotation ] [ Multimodal ]
In recent years, many large-scale datasets have become available, yet their annotations are coarse and noisy. In this paper, we propose a novel task, multimodal dataset upgrading, to enhance the quality of multimodal annotations. Unlike traditional annotation efforts that focus on creating labels from scratch, multimodal dataset upgrading seeks to refine existing annotations by increasing annotation granularity, reducing errors, and improving multimodal alignment. We propose a framework for tackling multimodal dataset upgrading that consists of two steps: generating candidates for upgrading, and cross-modality matching to select the upgraded data. We further provide a case study on open-vocabulary segmentation datasets, where improving the quality of class names yields significant performance improvements in state-of-the-art open-vocabulary segmentation models. As an initial exploration, we hope this paper showcases the benefits of data upgrading and opens up new avenues for research on data problems for foundation models.
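To make the two-step framework concrete, here is a minimal sketch of the idea for a single class name. It is not the authors' implementation. The helper names (`generate_candidates`, `upgrade_label`), the synonym table, and the two-dimensional toy embeddings are all hypothetical stand-ins. In practice the candidates might come from a language model and the embeddings from a vision-language model, but the abstract does not specify either.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def generate_candidates(label, synonyms):
    """Step 1 (assumed): propose finer-grained names for an existing label.

    The original label is always kept as a candidate, so upgrading can
    fall back to it when no proposal matches better.
    """
    return [label] + [c for c in synonyms.get(label, []) if c != label]


def upgrade_label(label, image_vec, synonyms, text_vecs):
    """Step 2 (assumed): pick the candidate whose text embedding best matches the image."""
    candidates = generate_candidates(label, synonyms)
    return max(candidates, key=lambda c: cosine(image_vec, text_vecs[c]))


# Toy data: hypothetical candidate table and hand-made 2-D "embeddings".
SYNONYMS = {"plant": ["potted plant", "tree"]}
TEXT_VECS = {
    "plant": [1.0, 1.0],
    "potted plant": [1.0, 0.0],
    "tree": [0.0, 1.0],
}

upgraded = upgrade_label("plant", [0.9, 0.1], SYNONYMS, TEXT_VECS)
print(upgraded)
```

Keeping the original label in the candidate set is a design choice of this sketch: it means the matching step can only replace a name when some proposal aligns better with the visual evidence.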