Abstract
Modern Convolutional Neural Networks (CNNs) are typically implemented with floating-point linear algebra. Recently, reduced-precision Neural Networks (NNs) have been gaining popularity because they require significantly less memory and fewer computational resources than their floating-point counterparts, which is particularly important in power-constrained compute environments. However, in many cases the reduction in precision comes at a small cost to the accuracy of the resulting network. In this work, we investigate the accuracy-throughput trade-off of various parameter precisions applied to different types of NN models. We first propose a quantization training strategy that enables reduced-precision NN inference with a lower memory footprint and competitive model accuracy. We then quantitatively formulate the relationship between data representation and hardware efficiency. Our experiments provide insightful observations. For example, one of our tests shows that 32-bit floating point is more hardware efficient than 1-bit parameters for reaching 99% MNIST accuracy. In general, within our tested problem domain, 2-bit and 4-bit fixed-point parameters offer the best hardware trade-off on small-scale datasets such as MNIST and CIFAR-10, while 4-bit provides the best trade-off on large-scale tasks such as AlexNet on ImageNet.
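The quantization training strategy itself is not detailed on this page. As a rough, hedged illustration of the general technique the abstract refers to, the sketch below quantizes weights to k-bit fixed point in the forward pass while keeping a full-precision shadow copy for the gradient update (a straight-through estimator). The layer shape, learning rate, and helper names are hypothetical and are not taken from the paper.

```python
# Minimal sketch of k-bit fixed-point weight quantization with a
# straight-through estimator, a common approach to reduced-precision
# training. Illustrative only; not the authors' exact training strategy.
import numpy as np

def quantize(w, bits):
    """Quantize weights in [-1, 1] to 'bits'-bit signed fixed point."""
    if bits == 32:                       # treat 32 bits as full precision
        return w
    levels = 2 ** (bits - 1) - 1         # symmetric signed range
    w = np.clip(w, -1.0, 1.0)
    return np.round(w * levels) / levels

# Full-precision "shadow" weights are kept for the update; only the
# quantized copy is used in the forward pass. The gradient computed
# w.r.t. the quantized weights is applied directly to the shadow copy.
rng = np.random.default_rng(0)
w_fp = rng.uniform(-1, 1, size=(784, 100))   # hypothetical layer shape
lr = 0.01

for step in range(3):
    w_q = quantize(w_fp, bits=2)             # e.g. 2-bit weights
    # ... forward pass with w_q, backward pass yielding grad_wq ...
    grad_wq = rng.normal(scale=1e-3, size=w_fp.shape)  # placeholder gradient
    w_fp -= lr * grad_wq                      # update the shadow weights
    w_fp = np.clip(w_fp, -1.0, 1.0)           # keep weights in quantizer range
```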
Notes
1. This assumes that both networks have the same memory footprint for their parameters.
2. The 784–1000\(\times\)3–10 structure in Table 2 is selected because it is the only structure reported in all of the mentioned works. We compare the different methods on the same structure and the same classification task for a fair comparison of memory and accuracy.
References
Umuroglu, Y., Fraser, N.J., Gambardella, G., Blott, M., Leong, P.H.W., Jahre, M., Vissers, K.A.: FINN: a framework for fast, scalable binarized neural network inference, CoRR (2016)
Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., Bengio, Y.: Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or \(-\)1, CoRR abs/1602.02830 (2016)
Rastegari, M., Ordonez, V., Redmon, J., Farhadi, A.: XNOR-Net: ImageNet classification using binary convolutional neural networks. In: ECCV (2016)
Sung, W., Shin, S., Hwang, K.: Resiliency of deep neural networks under quantization, CoRR abs/1511.06488 (2015)
Zhou, S., Ni, Z., Zhou, X., Wen, H., Wu, Y., Zou, Y.: DoReFa-Net: training low bitwidth convolutional neural networks with low bitwidth gradients, CoRR abs/1606.06160 (2016)
Denil, M., Shakibi, B., Dinh, L., Ranzato, M., de Freitas, N.: Predicting parameters in deep learning, CoRR abs/1306.0543 (2013)
Hwang, K., Sung, W.: Fixed-point feedforward deep neural network design using weights +1, 0, and \(-\)1. In: Proceedings of IEEE ICASSP, pp. 1–6. IEEE (2014)
Wu, J., Leng, C., Wang, Y., Hu, Q., Cheng, J.: Quantized convolutional neural networks for mobile devices, CoRR abs/1512.06473 (2015)
Courbariaux, M., Bengio, Y., David, J.: Low precision arithmetic for deep learning, CoRR abs/1412.7024 (2014)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization, CoRR abs/1412.6980 (2014)
Glorot, X., Bengio, Y.: Understanding the difficulty of training deep feedforward neural networks. In: Proceedings of AISTATS 2010 (2010)
Fraser, N.J., et al.: Scaling binarized neural networks on reconfigurable logic, CoRR abs/1701.03400 (2017)
Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images, Technical report (2009)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Proceedings of NIPS, pp. 1097–1105 (2012)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition, CoRR abs/1409.1556 (2014)
Ciresan, D.C., Meier, U., Masci, J., Gambardella, L.M., Schmidhuber, J.: High-performance neural networks for visual object classification, CoRR abs/1102.0183 (2011)
Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network, CoRR abs/1503.02531 (2015)
Chen, W., Wilson, J.T., Tyree, S., Weinberger, K.Q., Chen, Y.: Compressing neural networks with the hashing trick, CoRR abs/1504.04788 (2015)
Cai, Z., He, X., Sun, J., Vasconcelos, N.: Deep learning with low precision by half-wave Gaussian quantization, CoRR abs/1702.00953 (2017)
Acknowledgments
The authors from Imperial College London would like to acknowledge the support of UK’s research council (RCUK) with the following grants: EP/K034448, P010040 and N031768. The authors from The University of Sydney acknowledge support from the Australian Research Council Linkage Project LP130101034.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Su, J., et al. (2018). Accuracy to Throughput Trade-Offs for Reduced Precision Neural Networks on Reconfigurable Logic. In: Voros, N., Huebner, M., Keramidas, G., Goehringer, D., Antonopoulos, C., Diniz, P. (eds.) Applied Reconfigurable Computing. Architectures, Tools, and Applications. ARC 2018. Lecture Notes in Computer Science, vol. 10824. Springer, Cham. https://doi.org/10.1007/978-3-319-78890-6_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78889-0
Online ISBN: 978-3-319-78890-6
eBook Packages: Computer Science, Computer Science (R0)