Abstract
Neural networks (NNs) are the state of the art for many artificial intelligence (AI) applications. However, to facilitate training, most neural networks are over-parameterized, which results in significant computational and memory overheads. Numerous optimization techniques have therefore been proposed to alleviate the computational and memory requirements of these NNs. In this chapter, we highlight one of the prominent paradigms, i.e., approximate computing, which can significantly improve the resource requirements of these networks. We describe a sensitivity analysis methodology for estimating the significance of sub-parts of state-of-the-art NNs. Based on this significance analysis, we then present a methodology for employing a tolerable amount of approximation at various stages of the network, i.e., removal of ineffectual filters/neurons at the software layer, and precision reduction and memory approximations at the hardware layer. Towards the end of the chapter, we also highlight a few of the prominent challenges in adopting different types of approximation and the effects they have on the overall efficiency and accuracy of the baseline networks.
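To make the flow concrete, below is a minimal sketch of the two approximation knobs named above: pruning low-significance convolutional filters at the software layer, and reducing weight precision to a fixed-point format at the hardware layer. The L1-norm significance score, the function names, the keep_ratio parameter, and the bit-width choices are illustrative assumptions for this sketch, not the chapter's actual methodology.

```python
import numpy as np

def filter_significance(weights):
    # Estimate each filter's significance by its L1 norm, a common
    # proxy for a filter's contribution (an assumption in this sketch).
    # weights has shape (num_filters, in_channels, k, k).
    return np.abs(weights).sum(axis=(1, 2, 3))

def prune_filters(weights, keep_ratio=0.5):
    # Software-layer approximation: drop the least significant filters.
    scores = filter_significance(weights)
    num_keep = max(1, int(keep_ratio * weights.shape[0]))
    keep = np.sort(np.argsort(scores)[-num_keep:])  # preserve filter order
    return weights[keep]

def quantize_fixed_point(weights, int_bits=2, frac_bits=5):
    # Hardware-layer approximation: round weights to a signed fixed-point
    # grid with int_bits integer and frac_bits fractional bits, saturating
    # values that fall outside the representable range.
    scale = 2.0 ** frac_bits
    lo = -(2.0 ** int_bits)
    hi = 2.0 ** int_bits - 1.0 / scale
    return np.clip(np.round(weights * scale) / scale, lo, hi)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conv_w = rng.normal(scale=0.1, size=(64, 3, 3, 3))  # a toy conv layer
    approx_w = quantize_fixed_point(prune_filters(conv_w))
    print(conv_w.shape, "->", approx_w.shape)
```

In practice, the tolerable keep_ratio and bit widths would be chosen per layer based on the sensitivity analysis, since different layers differ widely in their error resilience.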
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Hanif, M.A., Javed, M.U., Hafiz, R., Rehman, S., Shafique, M. (2019). Hardware–Software Approximations for Deep Neural Networks. In: Reda, S., Shafique, M. (eds) Approximate Circuits. Springer, Cham. https://doi.org/10.1007/978-3-319-99322-5_13
DOI: https://doi.org/10.1007/978-3-319-99322-5_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99321-8
Online ISBN: 978-3-319-99322-5
eBook Packages: Engineering, Engineering (R0)