Learning Transferable Reward for Query Object Localization with Policy Adaptation

Tingfeng Li · Shaobo Han · Martin Min · Dimitris Metaxas

Mon 25 Apr 10:30 a.m. PDT — 12:30 p.m. PDT


We propose a reinforcement-learning-based approach to query object localization, in which an agent is trained to localize objects of interest specified by a small exemplary set. We learn a transferable reward signal, formulated using the exemplary set, by ordinal metric learning. The proposed method enables test-time policy adaptation to new environments where reward signals are not readily available, and it outperforms fine-tuning approaches that are limited to annotated images. In addition, the transferable reward allows a trained agent to be repurposed from one specific class to another. Experiments on the corrupted MNIST, CU-Birds, and COCO datasets demonstrate the effectiveness of our approach.
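
The abstract does not spell out the form of the reward, so the following is only a minimal sketch of how a reward built from an exemplary set might look, not the authors' implementation. It assumes (hypothetically) that candidate boxes and exemplars are already embedded as plain vectors, that the exemplary set is summarized by its mean embedding, that an ordinal constraint is enforced with a hinge loss on distances to that mean, and that the agent is rewarded with +1 or -1 depending on whether a step moves the box embedding closer to it. All function names are illustrative.

```python
import math


def distance(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def prototype(exemplars):
    """Mean embedding of the exemplary set (an assumed summary, not from the abstract)."""
    n = len(exemplars)
    return [sum(column) / n for column in zip(*exemplars)]


def ordinal_margin_loss(closer, farther, proto, margin=1.0):
    """Hinge loss asking `closer` to lie nearer the prototype than `farther`.

    Assumption: `closer` is the box that should rank higher, for example
    the one that overlaps the target object better during training.
    """
    return max(0.0, margin + distance(closer, proto) - distance(farther, proto))


def step_reward(prev_emb, next_emb, proto):
    """Return +1.0 if the step moved the box embedding toward the prototype, else -1.0.

    No box annotation is needed here, only the exemplary set.
    """
    return 1.0 if distance(next_emb, proto) < distance(prev_emb, proto) else -1.0


# Toy usage with two-dimensional embeddings.
exemplars = [[1.0, 0.0], [0.0, 1.0]]
proto = prototype(exemplars)
good_box, bad_box = [1.0, 1.0], [3.0, 3.0]
reward = step_reward(bad_box, good_box, proto)
loss_ok = ordinal_margin_loss(good_box, bad_box, proto)
loss_violated = ordinal_margin_loss(bad_box, good_box, proto)
```

Because this reward depends only on distances to the exemplar prototype, it can in principle be evaluated on unannotated test images, which is the property the abstract relies on for test-time policy adaptation.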
