Gated Transfer Network for Transfer Learning

  • Yi Zhu
  • Jia Xue
  • Shawn Newsam
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11364)

Abstract

Deep neural networks have led to a series of breakthroughs in computer vision given sufficient annotated training data. For novel tasks with limited labeled data, the prevalent approach is to transfer the knowledge learned by a pre-trained model to the new task by fine-tuning. Classic fine-tuning relies on the fact that well-trained neural networks appear to learn features that generalize across domains, but it treats all of these features equally during transfer. In this paper, we explore the impact of feature selection in model fine-tuning by introducing a transfer module that assigns weights to the features extracted from a pre-trained model. The proposed transfer module demonstrates the importance of feature selection when transferring models from source to target domains, and it significantly improves upon fine-tuning results at only marginal extra computational cost. We also incorporate an auxiliary classifier as an extra regularizer to avoid over-fitting. Finally, we build a Gated Transfer Network (GTN) based on our transfer module and achieve state-of-the-art results on six different tasks.
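
To make the gating idea concrete, below is a minimal PyTorch sketch of one plausible reading of the abstract: features from a frozen pre-trained backbone are reweighted elementwise by a learned sigmoid gate before classification, with an auxiliary classifier on the ungated features serving as the extra regularizer. All names and sizes here (GatedTransferModule, hidden_dim, feat_dim=2048) are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the gated transfer idea described in the abstract.
# Not the paper's code: module names, dimensions, and the auxiliary head
# placement are assumptions for illustration only.
import torch
import torch.nn as nn


class GatedTransferModule(nn.Module):
    """Reweights pre-trained features with a learned sigmoid gate."""

    def __init__(self, feat_dim: int, num_classes: int, hidden_dim: int = 256):
        super().__init__()
        # Gate: a small bottleneck MLP producing one weight in (0, 1)
        # per feature dimension (soft feature selection).
        self.gate = nn.Sequential(
            nn.Linear(feat_dim, hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, feat_dim),
            nn.Sigmoid(),
        )
        # Main classifier operates on the gated features.
        self.classifier = nn.Linear(feat_dim, num_classes)
        # Auxiliary classifier on the raw features, used as a regularizer
        # (its loss would be added to the main loss during training).
        self.aux_classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, feats: torch.Tensor):
        gated = feats * self.gate(feats)  # elementwise feature weighting
        return self.classifier(gated), self.aux_classifier(feats)


if __name__ == "__main__":
    # feats stands in for pooled backbone features, e.g. a ResNet's
    # 2048-dim global-average-pooled output.
    module = GatedTransferModule(feat_dim=2048, num_classes=100)
    feats = torch.randn(8, 2048)
    main_logits, aux_logits = module(feats)
    print(main_logits.shape, aux_logits.shape)  # torch.Size([8, 100]) each
```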

Keywords

Transfer learning · Feature selection · Sparse coding

Acknowledgement

We gratefully acknowledge the support of NVIDIA Corporation through the donation of the Titan Xp GPUs used in this work.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of California at Merced, Merced, USA
  2. Rutgers University, New Brunswick, USA
