
DNN Feature Map Compression Using Learned Representation over GF(2)

  • Denis Gudovskiy
  • Alec Hodgkinson
  • Luca Rigazio
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11132)

Abstract

In this paper, we introduce a method to compress intermediate feature maps of deep neural networks (DNNs) to decrease memory storage and bandwidth requirements during inference. Unlike previous works, the proposed method converts fixed-point activations into vectors over the smallest finite field, GF(2), followed by nonlinear dimensionality reduction (NDR) layers embedded into the DNN. Such an end-to-end learned representation finds more compact feature maps by exploiting quantization redundancies within the fixed-point activations along the channel or spatial dimensions. We apply the proposed network architectures, derived from modified SqueezeNet and MobileNetV2, to ImageNet classification and PASCAL VOC object detection. Compared to prior approaches, our experiments show a factor-of-2 decrease in memory requirements with minor degradation in accuracy while adding only bitwise computations.
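To make the bit-level view concrete, the following is a minimal NumPy sketch (not the authors' Caffe implementation) of how fixed-point activations can be unpacked into GF(2) bit-planes along the channel dimension and packed back. The bit width, layout, bit ordering, and function names are assumptions for illustration; the learned NDR compression layers described in the paper are omitted.

```python
# Minimal sketch (assumed 8-bit unsigned fixed-point activations, NCHW layout).
# Unpack activations into binary bit-planes over GF(2) and reconstruct them.
import numpy as np

def to_gf2_bitplanes(x_q: np.ndarray, bits: int = 8) -> np.ndarray:
    """(N, C, H, W) uint8 -> (N, C*bits, H, W) binary tensor with values in {0, 1}."""
    n, c, h, w = x_q.shape
    shifts = np.arange(bits, dtype=np.uint8)  # bit 0 = least significant bit
    planes = (x_q[:, :, None, :, :] >> shifts[None, None, :, None, None]) & 1
    return planes.reshape(n, c * bits, h, w).astype(np.uint8)

def from_gf2_bitplanes(b: np.ndarray, bits: int = 8) -> np.ndarray:
    """Inverse mapping: (N, C*bits, H, W) binary -> (N, C, H, W) uint8."""
    n, cb, h, w = b.shape
    planes = b.reshape(n, cb // bits, bits, h, w).astype(np.uint16)
    weights = (1 << np.arange(bits, dtype=np.uint16))[None, None, :, None, None]
    return (planes * weights).sum(axis=2).astype(np.uint8)

# Round-trip check on a random 8-bit feature map.
x = np.random.randint(0, 256, size=(1, 16, 8, 8), dtype=np.uint8)
assert np.array_equal(from_gf2_bitplanes(to_gf2_bitplanes(x)), x)
```

In the paper, the NDR layers would operate on this expanded binary representation to learn a more compact encoding before the inverse mapping; the sketch only shows the lossless bit-plane conversion itself.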

Keywords

Feature map compression · Dimensionality reduction · Network quantization · Memory-efficient inference

References

  1. Alwani, M., Chen, H., Ferdman, M., Milder, P.A.: Fused-layer CNN accelerators. In: MICRO, pp. 1–12, October 2016
  2. Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013)
  3. Courbariaux, M., Bengio, Y., David, J.: Training deep neural networks with low precision multiplications. In: ICLR, May 2015
  4. Everingham, M., Van Gool, L., Williams, C.K.I., Winn, J., Zisserman, A.: The PASCAL visual object classes (VOC) challenge. IJCV 88(2), 303–338 (2010)
  5. Graham, B., van der Maaten, L.: Submanifold sparse convolutional networks. arXiv preprint arXiv:1706.01307 (2017)
  6. Gysel, P., Motamedi, M., Ghiasi, S.: Hardware-oriented approximation of convolutional neural networks. In: ICLR, May 2016
  7. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. arXiv preprint arXiv:1512.03385 (2015)
  8. Hinton, G., Salakhutdinov, R.: Reducing the dimensionality of data with neural networks. Science 313(5786), 504–507 (2006)
  9. Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14, February 2014
  10. Huang, J., et al.: Speed/accuracy trade-offs for modern convolutional object detectors. In: CVPR, July 2017
  11. Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., Bengio, Y.: Binarized neural networks. In: NIPS, pp. 4107–4115 (2016)
  12. Iandola, F.N., Han, S., Moskewicz, M.W., Ashraf, K., Dally, W.J., Keutzer, K.: SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360 (2016)
  13. Jia, Y., et al.: Caffe: convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093 (2014)
  14. Liu, W., et al.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
  15. Miyashita, D., Lee, E.H., Murmann, B.: Convolutional neural networks using logarithmic data representation. arXiv preprint arXiv:1603.01025 (2016)
  16. O’Connor, P., Welling, M.: Sigma delta quantized networks. In: ICLR, April 2017
  17. Parashar, A., et al.: SCNN: an accelerator for compressed-sparse convolutional neural networks. In: ISCA, pp. 27–40 (2017)
  18. Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet classification using binary convolutional neural networks. arXiv preprint arXiv:1603.05279 (2016)
  19. Russakovsky, O., et al.: ImageNet large scale visual recognition challenge. IJCV 115(3), 211–252 (2015)
  20. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks. In: CVPR, June 2018
  21. Tang, W., Hua, G., Wang, L.: How to train a compact binary neural network with high accuracy? In: AAAI (2017)
  22. Zhou, S., Wu, Y., Ni, Z., Zhou, X., Wen, H., Zou, Y.: DoReFa-Net: training low bitwidth convolutional neural networks with low bitwidth gradients. arXiv preprint arXiv:1606.06160 (2016)

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Panasonic Beta Research Lab, Mountain View, USA
