
Deep Learning Inference with Dynamic Graphs on Heterogeneous Platforms

Abstract

A major drawback of deep-learning algorithms is the high computational complexity and memory bandwidth required for inference. To mitigate these costs in applications that rely on Convolutional Neural Networks (CNNs), a new and radical approach is the dynamic pruning of kernels, which aims at parsimonious inference by learning to exploit and dynamically remove the redundant capacity of a CNN architecture. This conditional-execution approach provides a systematic, data-driven method for building CNNs that are trained to change their size and form in real time during inference, targeting the smallest possible computational footprint. Conditional execution, however, introduces a number of challenges when these algorithms are implemented on embedded systems. In this paper we present a systematic way of deploying this dynamic pruning methodology on heterogeneous platforms that combine CPU and GPU subsystems. Real-time measurements of embedded implementations on modern SoCs verify the efficacy of the proposed methodology and demonstrate the ability of the dynamic networks both to adapt their size to the complexity of the task and to deliver significant computational gains during inference.
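The sketch below is a minimal illustration, not the authors' implementation, of the idea of per-input kernel gating described in the abstract: a small gating head scores each output kernel of a convolutional layer for the current input, and kernels whose score falls below a threshold are switched off at inference time. The class name `DynamicallyPrunedConv`, the gating architecture, and the `threshold` parameter are assumptions made for illustration; PyTorch is used only as a convenient vehicle.

```python
# Illustrative sketch of per-input (dynamic) kernel gating for a conv layer.
# Not the paper's method: names, gating head, and threshold are hypothetical.
import torch
import torch.nn as nn


class DynamicallyPrunedConv(nn.Module):
    def __init__(self, in_channels: int, out_channels: int, threshold: float = 0.5):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        # Gating head: global average pooling + tiny linear layer that scores
        # each output kernel for the current input sample.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_channels, out_channels),
            nn.Sigmoid(),
        )
        self.threshold = threshold

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.gate(x)  # (N, out_channels), values in [0, 1]
        if self.training:
            # Soft gating keeps the decision differentiable during training.
            mask = scores
        else:
            # Hard gating at inference: kernels below the threshold are pruned.
            mask = (scores > self.threshold).float()
        y = self.conv(x)
        # Broadcast the per-kernel mask over the spatial dimensions.
        return y * mask.unsqueeze(-1).unsqueeze(-1)


if __name__ == "__main__":
    block = DynamicallyPrunedConv(16, 32).eval()
    x = torch.randn(1, 16, 56, 56)
    with torch.no_grad():
        y = block(x)
    print(y.shape)  # torch.Size([1, 32, 56, 56])
```

Note that zeroing masked channels only emulates the behaviour; realizing actual speed-ups requires skipping the corresponding convolutions on the target CPU or GPU, which is precisely the deployment problem on heterogeneous platforms that the paper addresses.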


Funding

This work has received funding from the European Union’s Horizon 2020 Research and Innovation Programme under grant agreement No. 780788.

Author information

Correspondence to N. Fragoulis.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Pothos, V., Vassalos, E., Theodorakopoulos, I. et al. Deep Learning Inference with Dynamic Graphs on Heterogeneous Platforms. Int J Parallel Prog (2020). https://doi.org/10.1007/s10766-020-00654-2

Keywords

  • Deep learning
  • Convolutional neural networks
  • Heterogeneous platforms
  • Conditional execution
  • Dynamic pruning