
Accuracy to Throughput Trade-Offs for Reduced Precision Neural Networks on Reconfigurable Logic

  • Conference paper
  • In: Applied Reconfigurable Computing. Architectures, Tools, and Applications (ARC 2018)

Abstract

Modern Convolutional Neural Networks (CNNs) are typically implemented with floating-point linear algebra. Recently, reduced-precision Neural Networks (NNs) have been gaining popularity because they require significantly less memory and fewer computational resources than floating-point networks, which is particularly important in power-constrained compute environments. However, in many cases the reduction in precision comes at a small cost to the accuracy of the resulting network. In this work, we investigate the accuracy-throughput trade-off for various parameter precisions applied to different types of NN models. We first propose a quantization training strategy that allows reduced-precision NN inference with a lower memory footprint and competitive model accuracy. We then quantitatively formulate the relationship between data representation and hardware efficiency. Finally, our experiments provide insightful observations. For example, one of our tests shows that 32-bit floating point is more hardware-efficient than 1-bit parameters for reaching 99% accuracy on MNIST. In general, within our tested problem domain, 2-bit and 4-bit fixed-point parameters offer the better hardware trade-off on small-scale datasets such as MNIST and CIFAR-10, while 4-bit parameters provide the best trade-off on large-scale tasks such as AlexNet on the ImageNet dataset.
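
The paper's quantization training strategy is detailed in the full text; as a rough, hypothetical illustration of the kind of fixed-point parameter quantization the abstract refers to, the Python sketch below rounds a weight matrix onto a k-bit signed fixed-point grid and compares its memory footprint against 32-bit floating point. The function name and the chosen fractional bit width are assumptions made here for illustration, not taken from the paper.

    import numpy as np

    def quantize_fixed_point(w, bits, frac_bits):
        # Round to a signed fixed-point grid with `bits` total bits,
        # `frac_bits` of which sit after the binary point. Illustrative
        # only; this is not the paper's training procedure.
        scale = 2.0 ** frac_bits
        qmin = -(2 ** (bits - 1))        # most negative representable code
        qmax = 2 ** (bits - 1) - 1       # most positive representable code
        codes = np.clip(np.round(w * scale), qmin, qmax)
        return codes / scale             # dequantized values used at inference

    w = np.random.randn(784, 1000).astype(np.float32) * 0.05
    w4 = quantize_fixed_point(w, bits=4, frac_bits=3)
    print("max abs quantization error:", np.abs(w - w4).max())
    print("memory (32-bit float): %.2f MiB" % (w.size * 32 / 8 / 2**20))
    print("memory (4-bit fixed):  %.2f MiB" % (w.size * 4 / 8 / 2**20))

With 4-bit weights the same layer needs one eighth of the parameter memory of its 32-bit counterpart, which is the memory side of the accuracy-throughput trade-off studied in the paper.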


Notes

  1. This assumes that both networks have the same memory footprint for their parameters.

  2. The particular 784–1000×3–10 structure is selected in Table 2 because it is the only structure reported in all of the mentioned works. Comparing different methods on the same structure and the same classification task allows a fair comparison of memory and accuracy; a rough memory calculation for this structure is sketched below.
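
As a quick, hypothetical back-of-the-envelope check of that memory comparison (bias handling follows a common convention assumed here, not taken from the paper), the Python sketch below counts the parameters of the 784–1000×3–10 network and prints its footprint at several weight precisions.

    # 784 inputs, three hidden layers of 1000 units, 10 outputs
    layers = [784, 1000, 1000, 1000, 10]

    # weights plus biases for each fully connected layer
    params = sum(n_in * n_out + n_out
                 for n_in, n_out in zip(layers[:-1], layers[1:]))

    for bits in (32, 8, 4, 2, 1):
        mib = params * bits / 8 / 2**20      # bits -> bytes -> MiB
        print(f"{bits:>2}-bit parameters: {mib:7.2f} MiB ({params:,} parameters)")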


Acknowledgments

The authors from Imperial College London would like to acknowledge the support of Research Councils UK (RCUK) through the following grants: EP/K034448, P010040 and N031768. The authors from The University of Sydney acknowledge support from the Australian Research Council Linkage Project LP130101034.

Author information


Corresponding author

Correspondence to Jiang Su.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper


Cite this paper

Su, J. et al. (2018). Accuracy to Throughput Trade-Offs for Reduced Precision Neural Networks on Reconfigurable Logic. In: Voros, N., Huebner, M., Keramidas, G., Goehringer, D., Antonopoulos, C., Diniz, P. (eds) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2018. Lecture Notes in Computer Science, vol 10824. Springer, Cham. https://doi.org/10.1007/978-3-319-78890-6_3


  • DOI: https://doi.org/10.1007/978-3-319-78890-6_3


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78889-0

  • Online ISBN: 978-3-319-78890-6

  • eBook Packages: Computer Science, Computer Science (R0)
