
Convolutional Networks with Adaptive Inference Graphs

  • Andreas Veit
  • Serge Belongie
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11205)

Abstract

Do convolutional networks really need a fixed feed-forward structure? What if, after identifying the high-level concept of an image, a network could move directly to a layer that can distinguish fine-grained differences? Currently, a network would first need to execute sometimes hundreds of intermediate layers that specialize in unrelated aspects. Ideally, the more a network already knows about an image, the better it should be at deciding which layer to compute next. In this work, we propose convolutional networks with adaptive inference graphs (ConvNet-AIG) that adaptively define their network topology conditioned on the input image. Following a high-level structure similar to residual networks (ResNets), ConvNet-AIG decides on the fly, for each input image, which layers are needed. In experiments on ImageNet we show that ConvNet-AIG learns distinct inference graphs for different categories. ConvNet-AIG with 50 and 101 layers both outperform their ResNet counterparts, while using 20% and 33% less computation, respectively. By grouping parameters into layers for related classes and only executing relevant layers, ConvNet-AIG improves both efficiency and overall classification quality. Lastly, we also study the effect of adaptive inference graphs on susceptibility to adversarial examples. We observe that ConvNet-AIG shows higher robustness than ResNets, complementing other known defense mechanisms.
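
To make the adaptive gating concrete, the following is a minimal PyTorch sketch of a gated residual block in the spirit of ConvNet-AIG. This is a sketch under stated assumptions, not the authors' reference code: the module names (AIGGate, AIGBlock), the pooling-plus-MLP gate design, and the use of torch.nn.functional.gumbel_softmax as a straight-through estimator for the discrete execute/skip decision are illustrative choices.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AIGGate(nn.Module):
        """Per-image gate: pooled features -> 2-way logits -> hard sample."""
        def __init__(self, channels, hidden=16):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(channels, hidden),
                nn.ReLU(inplace=True),
                nn.Linear(hidden, 2),
            )

        def forward(self, x):
            pooled = F.adaptive_avg_pool2d(x, 1).flatten(1)  # (N, C)
            logits = self.mlp(pooled)                        # (N, 2)
            # hard=True returns a one-hot sample in the forward pass while
            # gradients flow through the soft relaxation (straight-through).
            sample = F.gumbel_softmax(logits, tau=1.0, hard=True)
            return sample[:, 1]  # 1 = "execute this layer", 0 = "skip"

    class AIGBlock(nn.Module):
        """Residual block whose body is weighted by the per-image gate."""
        def __init__(self, channels):
            super().__init__()
            self.gate = AIGGate(channels)
            self.body = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1, bias=False),
                nn.BatchNorm2d(channels),
            )

        def forward(self, x):
            g = self.gate(x).view(-1, 1, 1, 1)  # broadcast over C, H, W
            # Note: this sketch always evaluates the body and masks it with g;
            # realizing the compute savings reported in the paper requires
            # actually skipping gated-off blocks at inference time.
            return F.relu(x + g * self.body(x))

For example, y = AIGBlock(64)(torch.randn(8, 64, 32, 32)) produces a batch in which each image independently either passes through the block's body or takes only the identity shortcut.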

Acknowledgements

We would like to thank Ilya Kostrikov, Daniel D. Lee, Kimberly Wilber, Antonio Marcedone and Yiqing Hua for insightful discussions and feedback. This work was supported in part by the Oath Laboratory for Connected Experiences, a Google Focused Research Award, AWS Cloud Credits for Research and a Facebook equipment donation.

Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Department of Computer Science and Cornell Tech, Cornell University, New York, USA
