
Target Aware Network Adaptation for Efficient Representation Learning

  • Yang Zhong
  • Vladimir Li
  • Ryuzo Okada
  • Atsuto Maki
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11132)

Abstract

This paper presents an automatic network adaptation method that finds a ConvNet structure well-suited to a given target task, e.g. image classification, for efficiency as well as accuracy in transfer learning. We call this concept target-aware transfer learning. Given only small-scale labeled data, and starting from an ImageNet pre-trained network, we exploit a scheme that removes the network's potential redundancy for the target task through iterative operations of filter-wise pruning and network optimization. The basic motivation is that compact networks are not only more efficient but, being less complex, should also be more tolerant to the risk of overfitting, which would otherwise hinder the generalization of learned representations in the context of transfer learning. Further, unlike existing methods involving network simplification, we let the scheme identify redundant portions across the entire network, which automatically results in a network structure adapted to the task at hand. We achieve this with a few novel ideas: (i) a cumulative sum of activation statistics for each layer, and (ii) a priority evaluation of pruning across multiple layers. Experimental results of the method on five datasets (Flower102, CUB200-2011, Dog120, MIT67, and Stanford40) show favorable accuracies over related state-of-the-art techniques while enhancing the computational and storage efficiency of the transferred model.
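The two key ideas, (i) and (ii), can be sketched roughly as follows. This is an illustrative sketch, not the paper's exact algorithm: the per-filter statistic (mean activation magnitude over a held-out set), the energy-fraction threshold, and the function names are all assumptions made for the example.

```python
import numpy as np

def filters_to_keep(act_stats, energy_frac=0.95):
    """Idea (i): keep the smallest set of filters whose cumulative
    activation statistics cover `energy_frac` of the layer's total;
    the remaining filters become pruning candidates."""
    order = np.argsort(act_stats)[::-1]               # strongest filters first
    csum = np.cumsum(act_stats[order])                # cumulative activation mass
    k = int(np.searchsorted(csum, energy_frac * csum[-1])) + 1
    return order[:k]

def pruning_priority(layer_stats, energy_frac=0.95):
    """Idea (ii): rank layers across the whole network by how little
    activation mass their pruning candidates carry; layers whose
    candidates contribute least are pruned first."""
    scores = {}
    for name, stats in layer_stats.items():
        keep = filters_to_keep(stats, energy_frac)
        lost = stats.sum() - stats[keep].sum()        # mass removed by pruning
        scores[name] = lost / stats.sum()
    return sorted(scores, key=scores.get)             # lowest relative loss first
```

In an iterative scheme of this kind, one would prune the highest-priority layer, fine-tune the network on the target data, recompute the statistics, and repeat until accuracy degrades.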

Keywords

Target-aware · Network adaptation · Model compaction · Transfer learning

Notes

Acknowledgements

We acknowledge fruitful discussions and comments on this work from colleagues at the Toshiba Corporate Research and Development Center. We thank NVIDIA Corporation for their generous donation of NVIDIA GPUs.


Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Yang Zhong (1) (Email author)
  • Vladimir Li (1)
  • Ryuzo Okada (2)
  • Atsuto Maki (1)

  1. KTH Royal Institute of Technology, Stockholm, Sweden
  2. Toshiba Corporate Research and Development Center, Kawasaki, Japan
