Pay Attention to Features, Transfer Learn Faster CNNs

Kafeng Wang; Xitong Gao; Yiren Zhao; Xingjian Li; Dejing Dou; Cheng-Zhong Xu

Abstract: Deep convolutional neural networks are now widely deployed in vision applications, but limited training data can restrict their task performance. Transfer learning lets CNNs learn from limited data by transferring knowledge from models pretrained on large datasets. Blindly transferring all learned features from the source dataset, however, adds unnecessary computation to CNNs on the target task. In this paper, we propose attentive feature distillation and selection (AFDS), which not only adjusts the strength of transfer learning regularization but also dynamically determines the important features to transfer. By deploying AFDS on ResNet-101, we achieved a state-of-the-art computation reduction at the same accuracy budget, outperforming all existing transfer learning methods. With a 10x MACs reduction budget, a ResNet-101 equipped with AFDS and transfer learned from ImageNet to Stanford Dogs 120 can achieve an accuracy 11.07% higher than its best competitor.
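
The two ideas named in the abstract, weighting how strongly each feature is distilled and selecting which features to compute, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify how attention is computed, so the softmax over per-channel scores, the top-k selection rule, and all function names below (`conv_macs`, `weighted_distillation_loss`, `top_k_channels`) are assumptions made purely for illustration.

```python
import math


def conv_macs(h_out, w_out, c_in, c_out, k):
    """Multiply-accumulate count of a standard k x k convolution
    (no groups, bias ignored)."""
    return h_out * w_out * c_in * c_out * k * k


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_distillation_loss(student, teacher, scores):
    """Attention-weighted feature distillation (illustrative assumption).

    student, teacher: lists of channels, each channel a flat list of floats.
    scores: one raw importance score per channel (hypothetical input).
    Returns the sum over channels of weight * squared L2 distance.
    """
    weights = softmax(scores)
    loss = 0.0
    for w, s_ch, t_ch in zip(weights, student, teacher):
        loss += w * sum((a - b) ** 2 for a, b in zip(s_ch, t_ch))
    return loss


def top_k_channels(scores, keep_ratio):
    """Indices of the highest-scoring channels to keep (assumed rule)."""
    k = max(1, math.ceil(keep_ratio * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    full = conv_macs(14, 14, 256, 256, 3)
    pruned = conv_macs(14, 14, 64, 64, 3)  # keep a quarter of channels
    print(full // pruned)
    print(top_k_channels([0.1, 2.0, -1.0, 0.5], keep_ratio=0.5))
```

Because a convolution's MACs scale with the product of its input and output channel counts, keeping a quarter of the channels on both sides cuts that layer's MACs by a factor of 16, which is why selecting features to transfer can reduce computation so sharply.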

Similar Papers

Contrastive Representation Distillation

Yonglong Tian, Dilip Krishnan, Phillip Isola

Adversarial AutoAugment

Xinyu Zhang, Qiang Wang, Jian Zhang, Zhao Zhong

SNOW: Subscribing to Knowledge via Channel Pooling for Transfer & Lifelong Learning of Convolutional Neural Networks

Chungkuk Yoo, Bumsoo Kang, Minsik Cho