
Virtual presentation / top 25% paper

Using Language to Extend to Unseen Domains

Lisa Dunlap · Clara Mohri · Devin Guillory · Han Zhang · Trevor Darrell · Joseph E. Gonzalez · Aditi Raghunathan · Anna Rohrbach

Keywords: [ vision and language ] [ domain adaptation ] [ robust training ] [ Applications ]


Abstract: It is expensive to collect training data for every possible domain that a vision model may encounter when deployed. We instead consider how simply verbalizing the training domain (e.g., "photos of birds") as well as domains we want to extend to but do not have data for (e.g., "paintings of birds") can improve robustness. Using a multimodal model with a joint image and language embedding space, our method LADS learns a transformation of the image embeddings from the source domain to each target domain, while preserving task-relevant information. Without using any images from the target domain, we show that over the extended domain containing both source and target, LADS outperforms standard fine-tuning and ensemble approaches on a suite of four benchmarks targeting domain adaptation and dataset bias.
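To make the idea concrete, below is a minimal sketch of a LADS-style embedding augmentation. It is an illustration under assumptions, not the authors' released code: it operates on precomputed, L2-normalized CLIP-style embeddings (random placeholders here), uses a small MLP as the source-to-target transformation, a language-derived domain direction for alignment, and a frozen source classifier for class consistency. All names (aug_net, clf, domain_dir) and the exact loss forms are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

EMB_DIM, NUM_CLASSES = 512, 10

# Placeholders for precomputed, normalized CLIP embeddings; in practice these
# come from a frozen image/text encoder with a joint embedding space.
img_emb = F.normalize(torch.randn(256, EMB_DIM), dim=-1)   # source-domain images
labels = torch.randint(0, NUM_CLASSES, (256,))
src_text = F.normalize(torch.randn(EMB_DIM), dim=-1)       # e.g. "a photo of a bird"
tgt_text = F.normalize(torch.randn(EMB_DIM), dim=-1)       # e.g. "a painting of a bird"

# Augmentation network: maps a source image embedding to a hypothesized
# target-domain embedding.
aug_net = nn.Sequential(
    nn.Linear(EMB_DIM, EMB_DIM), nn.ReLU(), nn.Linear(EMB_DIM, EMB_DIM)
)

# Classifier head trained on the source domain, kept frozen here so that the
# augmentation must preserve task-relevant (class) information.
clf = nn.Linear(EMB_DIM, NUM_CLASSES)
for p in clf.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(aug_net.parameters(), lr=1e-3)
domain_dir = F.normalize(tgt_text - src_text, dim=-1)  # language-defined shift

for step in range(100):
    aug_emb = F.normalize(aug_net(img_emb), dim=-1)

    # Domain-alignment loss: the change applied to each image embedding should
    # follow the text-derived source-to-target direction.
    shift = F.normalize(aug_emb - img_emb, dim=-1)
    loss_domain = (1 - (shift * domain_dir).sum(-1)).mean()

    # Class-consistency loss: augmented embeddings should keep their labels
    # under the frozen source classifier.
    loss_class = F.cross_entropy(clf(aug_emb), labels)

    loss = loss_domain + loss_class
    opt.zero_grad()
    loss.backward()
    opt.step()

# A downstream classifier would then be fine-tuned on both the original and the
# augmented ("extended domain") embeddings, without any target-domain images.
```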
