Sparsity in Deep Neural Networks - An Empirical Investigation with TensorQuant

  • Dominik Marek Loroch
  • Franz-Josef Pfreundt
  • Norbert Wehn
  • Janis Keuper
Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 967)


Deep learning is finding its way into the embedded world with applications such as autonomous driving, smart sensors and augmented reality. However, the computation of deep neural networks is demanding in energy, compute power and memory. Various approaches have been investigated to reduce the necessary resources, one of which is to leverage the sparsity occurring in deep neural networks due to the high levels of redundancy in the network parameters. It has been shown that sparsity can be promoted deliberately and that very high sparsity levels can be achieved. However, in many cases these methods are evaluated on rather small topologies, and it is not clear whether the results transfer to deeper ones.

In this paper, the TensorQuant toolbox has been extended to offer a platform to investigate sparsity, especially in deeper models. Several practically relevant topologies for varying classification problem sizes are investigated to show how sparsity differs across activations, weights and gradients.
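The quantity measured throughout is the sparsity of a tensor, i.e. the fraction of its entries that are (near) zero. The following is a minimal NumPy sketch of that metric, not the actual TensorQuant API; the function name `sparsity` and the optional `threshold` parameter (treating small-magnitude values as zero) are illustrative assumptions.

```python
import numpy as np

def sparsity(tensor, threshold=0.0):
    """Fraction of entries whose magnitude is at or below `threshold`.

    threshold=0.0 counts exact zeros; a small positive threshold also
    counts near-zero values, which is useful for weights and gradients
    that are rarely exactly zero.
    """
    tensor = np.asarray(tensor, dtype=np.float64)
    return np.count_nonzero(np.abs(tensor) <= threshold) / tensor.size

# ReLU activations are a typical source of sparsity: all negative
# pre-activations are mapped to exactly zero.
pre_activations = np.random.randn(10000)
activations = np.maximum(pre_activations, 0.0)
print(f"activation sparsity: {sparsity(activations):.2f}")  # roughly 0.5
```

For weights and gradients, a nonzero threshold (e.g. a small fraction of the tensor's maximum magnitude) is the more informative measurement, since those tensors are dense in the exact-zero sense but may still be highly compressible.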


Keywords: Deep neural networks · Sparsity · Toolbox



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Dominik Marek Loroch (1, 2)
  • Franz-Josef Pfreundt (1)
  • Norbert Wehn (2)
  • Janis Keuper (1, 3)
  1. Fraunhofer ITWM, Kaiserslautern, Germany
  2. TU Kaiserslautern, Kaiserslautern, Germany
  3. Fraunhofer Center Machine Learning, St. Augustin, Germany