Abstract
Neural networks (NNs) are the state of the art for many artificial intelligence (AI) applications. However, to facilitate training, most neural networks are over-parameterized, which results in significant computational and memory overheads. Numerous optimization techniques have therefore been proposed to alleviate the computational and memory requirements of these NNs. In this chapter, we highlight one of the prominent paradigms, i.e., approximate computing, which can significantly improve the resource requirements of these networks. We describe a sensitivity analysis methodology for estimating the significance of sub-parts of state-of-the-art NNs. Based on this significance analysis, we then present a methodology for employing a tolerable amount of approximation at various stages of the network, i.e., removal of ineffectual filters/neurons at the software layer, and precision reduction and memory approximations at the hardware layer. Towards the end of the chapter, we also highlight a few of the prominent challenges in adopting different types of approximation and the effects they have on the overall efficiency and accuracy of the baseline networks.
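To make the flow concrete, below is a minimal sketch of the two approximation knobs named above: pruning low-significance convolutional filters at the software layer, and reducing weight precision to a fixed-point format at the hardware layer. The L1-norm significance score, the function names, the keep_ratio parameter, and the bit-width choices are illustrative assumptions for this sketch, not the chapter's actual methodology.

```python
import numpy as np

def filter_significance(weights):
    # Estimate each filter's significance by its L1 norm, a common
    # proxy for a filter's contribution (an assumption in this sketch).
    # weights has shape (num_filters, in_channels, k, k).
    return np.abs(weights).sum(axis=(1, 2, 3))

def prune_filters(weights, keep_ratio=0.5):
    # Software-layer approximation: drop the least significant filters.
    scores = filter_significance(weights)
    num_keep = max(1, int(keep_ratio * weights.shape[0]))
    keep = np.sort(np.argsort(scores)[-num_keep:])  # preserve filter order
    return weights[keep]

def quantize_fixed_point(weights, int_bits=2, frac_bits=5):
    # Hardware-layer approximation: round weights to a signed fixed-point
    # grid with int_bits integer and frac_bits fractional bits, saturating
    # values that fall outside the representable range.
    scale = 2.0 ** frac_bits
    lo = -(2.0 ** int_bits)
    hi = 2.0 ** int_bits - 1.0 / scale
    return np.clip(np.round(weights * scale) / scale, lo, hi)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conv_w = rng.normal(scale=0.1, size=(64, 3, 3, 3))  # a toy conv layer
    approx_w = quantize_fixed_point(prune_filters(conv_w))
    print(conv_w.shape, "->", approx_w.shape)
```

In practice, the tolerable keep_ratio and bit widths would be chosen per layer based on the sensitivity analysis, since different layers differ widely in their error resilience.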
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Hanif, M.A., Javed, M.U., Hafiz, R., Rehman, S., Shafique, M. (2019). Hardware–Software Approximations for Deep Neural Networks. In: Reda, S., Shafique, M. (eds) Approximate Circuits. Springer, Cham. https://doi.org/10.1007/978-3-319-99322-5_13
DOI: https://doi.org/10.1007/978-3-319-99322-5_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99321-8
Online ISBN: 978-3-319-99322-5
eBook Packages: Engineering, Engineering (R0)