Hardware–Software Approximations for Deep Neural Networks

Chapter in: Approximate Circuits

Abstract

Neural networks (NNs) are the state of the art for many artificial intelligence (AI) applications. However, to facilitate the training process, most neural networks are over-parameterized, which results in significant computational and memory overheads. Numerous optimization techniques have therefore been proposed to alleviate the computational and memory requirements of these NNs. In this chapter, we highlight one of the prominent paradigms, namely approximate computing, which can significantly reduce the resource requirements of these networks. We describe a sensitivity analysis methodology for estimating the significance of the sub-parts of state-of-the-art NNs. Based on this significance analysis, we then present a methodology for employing a tolerable amount of approximation at various stages of the network: removal of ineffectual filters/neurons at the software layer, and precision reduction and memory approximations at the hardware layer. Towards the end of the chapter, we also highlight a few of the prominent challenges in adopting different types of approximation and the effects they have on the overall efficiency and accuracy of the baseline networks.
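To make the two approximation knobs named in the abstract concrete, the sketch below is a minimal illustration, not the chapter's actual methodology or the authors' code: it uses plain NumPy to (a) rank convolutional filters by their L1-norm as a simple significance proxy and drop the least significant ones, and (b) reduce the precision of the surviving weights with uniform fixed-point quantization. All function names, the pruning ratio, and the bit-width are illustrative assumptions.

```python
# Illustrative sketch of two DNN approximations: L1-norm-based filter pruning
# (a software-layer approximation) and uniform weight quantization (a stand-in
# for hardware-layer precision reduction). Not the chapter's methodology.
import numpy as np

def rank_filters_by_l1(weights):
    """Return filter indices sorted from least to most significant.

    weights: array of shape (num_filters, channels, k, k) for one conv layer.
    The L1-norm of a filter is a common saliency proxy; filters with small
    norms contribute little to the layer's output and are pruning candidates.
    """
    saliency = np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)
    return np.argsort(saliency)

def prune_filters(weights, prune_ratio=0.3):
    """Remove the least significant fraction of filters from a conv layer."""
    order = rank_filters_by_l1(weights)
    keep = np.sort(order[int(prune_ratio * len(order)):])
    # 'keep' also tells the next layer which of its input channels survive.
    return weights[keep], keep

def quantize_uniform(weights, num_bits=8):
    """Uniformly quantize weights to a signed fixed-point grid."""
    max_abs = np.max(np.abs(weights))
    scale = max_abs / (2 ** (num_bits - 1) - 1)
    q = np.round(weights / scale).astype(np.int32)
    return q * scale  # dequantized values, restricted to 2^num_bits levels

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conv_w = rng.normal(size=(64, 32, 3, 3))   # 64 filters, 32 input channels
    pruned_w, kept = prune_filters(conv_w, prune_ratio=0.3)
    approx_w = quantize_uniform(pruned_w, num_bits=8)
    print(f"filters: {conv_w.shape[0]} -> {pruned_w.shape[0]}, "
          f"max quantization error: {np.max(np.abs(pruned_w - approx_w)):.4f}")
```

Note that pruning filters in one layer also removes the corresponding input channels of the next layer, which is why prune_filters returns the surviving indices. In the chapter's setting, a sensitivity analysis would decide how much approximation each layer can tolerate; the fixed 30% ratio and 8-bit width here are purely illustrative.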



Author information

Corresponding author: Muhammad Abdullah Hanif.



Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Hanif, M.A., Javed, M.U., Hafiz, R., Rehman, S., Shafique, M. (2019). Hardware–Software Approximations for Deep Neural Networks. In: Reda, S., Shafique, M. (eds) Approximate Circuits. Springer, Cham. https://doi.org/10.1007/978-3-319-99322-5_13

  • DOI: https://doi.org/10.1007/978-3-319-99322-5_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-99321-8

  • Online ISBN: 978-3-319-99322-5

  • eBook Packages: Engineering (R0)
