Skip to main content

Energy Efficient RRAM Crossbar-Based Approximate Computing for Smart Cameras

  • Chapter
  • First Online:

Abstract

Smart cameras have been applied successfully in many fields. The limited battery capacity and power efficiency restrict the local processing capacity of smart cameras. In order to shift vision processing closer to the sensors, we propose a power efficient framework for analog approximate computing with the emerging metal-oxide resistive switching random-access memory (RRAM) devices. A programmable RRAM-based approximate computing unit (RRAM-ACU) is introduced first to accelerate approximated computation, and a scalable approximate computing framework is then proposed on top of the RRAM-ACU. In order to program the RRAM-ACU efficiently, we also present a detailed configuration flow, which includes a customized approximator training scheme, an approximator-parameter-to-RRAM-state mapping algorithm, and an RRAM state tuning scheme. Simulation results on a set of diverse benchmarks demonstrate that, compared with an x86-64 CPU at 2 GHz, the RRAM-ACU is able to achieve 4.06–196.41× speedup and power efficiency of 24.59–567.98 GFLOPS/W with quality loss of 8.72 % on average. The implementation of HMAX application further demonstrates that the proposed RRAM-based approximate computing framework can achieve > 12. 8× power efficiency than the digital implementation counterparts (CPU, GPU, and FPGA).

This is a preview of subscription content, log in via an institution.

Buying options

Chapter
USD   29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD   99.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD   179.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info
Hardcover Book
USD   199.99
Price excludes VAT (USA)
  • Durable hardcover edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

Notes

  1. 1.

    A neural network will tend to overfit when many weights of the network are large [28]. Overfitting is a problem that the model learns too much, including the noise, from the training data. The trained model will have poor predictive performance on the unknown testing data which are not covered by the training set.

  2. 2.

    We use 2 regularization in the training scheme. Regularization is a technique widely used in the neural network training to limit the amplitude of network weight, avoid overfitting, and improve model generalization [28]. To be specific, for the 2 regularization, a penalty of the square of the 2-norm of network weights will be proportionally added to the loss function of the network. So the error of the network and the amplitude of weights will be balanced and optimized simultaneously in the training process [28].

References

  1. Graf R, Belbachir A, King R, Mayerhofer M (2013) Quality control of real-time panoramic views from the smart camera 360 scan. In: 2013 I.E. international symposium on circuits and systems (ISCAS), pp.650–653

    Google Scholar 

  2. Esmaeilzadeh H, Sampson A, Ceze L, Burger D (2012) Neural acceleration for general-purpose approximate programs. In: International symposium on microarchitecture(MICRO), pp 449–460

    Google Scholar 

  3. DARPA (2012) Power efficiency revolution for embedded computing technologies [Online]. Available: https://www.fbo.gov/

  4. NVIDIA Tesla K-Series, DATASHEET (2012) Kepler family product overview [Online]. Available: http://www.nvidia.com/content/tesla/pdf/tesla-kseries-overview-lr.pdf

  5. Intel. (2016) Intel microprocessor export compliance metrics

    Google Scholar 

  6. Esmaeilzadeh H, Blem E, Aman RS, Sankaralingam K, Burger D (2011) Dark silicon and the end of multicore scaling. In: 2011 38th annual international symposium on computer architecture (ISCA). IEEE, pp 365–376

    Google Scholar 

  7. Li B, Shan Y, Hu M, Wang Y, Chen Y, Yang H, Memristor-based approximated computation. In: Low power electronics and design (ISLPED), pp 242–247

    Google Scholar 

  8. Xu C, Dong X, Jouppi NP, Xie Y (2011) Design implications of memristor-based RRAM cross-point structures. In: Design, automation & test in Europe conference & exhibition (DATE). IEEE, pp 1–6

    Google Scholar 

  9. Jo SH, Chang T, Ebong I, Bhadviya BB, Mazumder P, Lu W (2010) Nanoscale memristor device as synapse in neuromorphic systems. Nano Lett 10(4):1297–1301

    Article  Google Scholar 

  10. Hu M, Li H, Wu Q, Rose GS (2012) Hardware realization of BSB recall function using memristor crossbar arrays. In: Design automation conference, pp 498–503

    Google Scholar 

  11. Chakradhar S, Raghunathan A (2010) Best-effort computing: re-thinking parallel software and hardware. In: 47th ACM/IEEE design automation conference (DAC), pp 865–870

    Google Scholar 

  12. Ye R, Wang T, Yuan F, Kumar R, Xu Q (2013) On reconfiguration-oriented approximate adder design and its application. In: Proceedings of the international conference on computer-aided design. IEEE, pp 48–54

    Google Scholar 

  13. Venkataramani S, Chippa VK, Chakradhar ST, Roy K, Raghunathan A (2013) Quality programmable vector processors for approximate computing. In: Proceedings of the 46th annual IEEE/ACM international symposium on microarchitecture. ACM, pp 1–12

    Google Scholar 

  14. Wong HSP, Lee H-Y, Yu S, Chen Y-S, Wu Y, Chen P-S, Lee B, Chen F, Tsai M-J (2012) Metal-oxide RRAM. Proc IEEE 100(6):1951–1970

    Article  Google Scholar 

  15. Yu S, Gao B, Fang Z, Yu H, Kang J, Wong H-SP, (2013) A low energy oxide-based electronic synaptic device for neuromorphic visual systems with tolerance to device variation. Adv Mater 25(12):1774–1779

    Article  Google Scholar 

  16. Deng Y, Huang P, Chen B, Yang X, Gao B, Wang J, Zeng L, Du G, Kang J, Liu X (2013) RRAM crossbar array with cell selection device: a device and circuit interaction study. IEEE Trans Electron Devices 60(2):719–726

    Article  Google Scholar 

  17. Alibart F, Gao L, Hoskins BD, Strukov DB (2012) High precision tuning of state for memristive devices by adaptable variation-tolerant algorithm. Nanotechnology 23(7):075201

    Article  Google Scholar 

  18. Guan X, Yu S, Wong H-S (2012) A spice compact model of metal oxide resistive switching memory with variations. IEEE Electron Device Lett 33(10):1405–1407

    Article  Google Scholar 

  19. Hornik K, Stinchcombe M, White H (1989) Multilayer feedforward networks are universal approximators. Neural Netw 2(5):359–366

    Article  Google Scholar 

  20. Ito Y (1994) Approximation capability of layered neural networks with sigmoid units on two layers. Neural Comput 6(6):1233–1243

    Article  MATH  Google Scholar 

  21. Gu P, Li B, Tang T, Yu S, Cao Y, Wang Y, Yang H (2015) Technological exploration of RRAM crossbar array for matrix-vector multiplication. In: The 20th Asia and south pacific design automation conference (ASPDAC). IEEE, pp 106–111

    Google Scholar 

  22. Cannizzaro SO, Grasso AD, Mita R, Palumbo G, Pennisi S (2007) Design procedures for three-stage CMOS OTAs with nested-Miller compensation. IEEE Trans Circuits Syst I Regul Pap 54(5):933–940

    Article  Google Scholar 

  23. Oh W, Bakkaloglu B (2007) A CMOS low-dropout regulator with current-mode feedback buffer amplifier. IEEE Trans Circuits Syst II Express Briefs 54(10):922–926

    Article  Google Scholar 

  24. Allen PE, Holberg DR (2002) CMOS analog circuit design. Oxford University Press, Oxford

    Google Scholar 

  25. Li B, Wang Y, Chen Y, Li HH, Yang H (2014) Ice: inline calibration for memristor crossbar-based computing engine. In: Proceedings of the conference on design, automation & test in Europe. European Design and Automation Association, p 184

    Google Scholar 

  26. Khodabandehloo G, Mirhassani M, Ahmadi M (2012) Analog implementation of a novel resistive-type sigmoidal neuron. IEEE Trans Very Large Scale Integr (VLSI) Syst 20(4):750–754

    Article  Google Scholar 

  27. Fausett L (ed) (1994) Fundamentals of neural networks: architectures, algorithms, and applications. Prentice-Hall, Inc., Upper Saddle River

    MATH  Google Scholar 

  28. Girosi F, Jones M, Poggio T (1995) Regularization theory and neural networks architectures. Neural Comput 7(2):219–269

    Article  Google Scholar 

  29. Bedeschi F, Fackenthal R, Resta C, Donze E, Jagasivamani M, Buda E, Pellizzer F, Chow D, Cabrini A, Calvi G, Faravelli R, Fantini A, Torelli G, Mills D, Gastaldi R, Casagrande G (2009) A bipolar-selected phase change memory featuring multi-level cell storage. IEEE J Solid-State Circuits 44(1):217–227

    Article  Google Scholar 

  30. Lee H, Chen P, Wu T, Chen Y, Wang C, Tzeng P, Lin C, Chen F, Lien C, Tsai M (2008) Low power and high speed bipolar switching with a thin reactive ti buffer layer in robust HFO2 based RRAM. In: IEEE international electron devices meeting (IEDM), pp 1–4

    Google Scholar 

  31. Kannan S, Rajendran J, Karri R, Sinanoglu O (2013) Sneak-path testing of crossbar-based nonvolatile random access memories. IEEE Trans Nanotechnol 12(3):413–426

    Article  Google Scholar 

  32. ITRS (2013) International technology roadmap for semiconductors

    Google Scholar 

  33. Gulati K, Lee H-S (1998) A high-swing CMOS telescopic operational amplifier. IEEE J. Solid-State Circuits 33(12):2010–2019

    Article  Google Scholar 

  34. Kull L, Toifl T, Schmatz M, Francese PA, Menolfi C, Braendli M, Kossel M, Morf T, Andersen TM, Leblebici Y (2013) A 3.1 mw 8b 1.2 gs/s single-channel asynchronous SAR ADC with alternate comparators for enhanced speed in 32nm digital SOI CMOS. In: 2013 I.E. international solid-state circuits conference digest of technical papers (ISSCC). IEEE, pp 468–469

    Google Scholar 

  35. Lin W-T, Kuo T-H (2013) A 12b 1.6 gs/s 40 mw dac in 40 nm CMOS with > 70db SFDR over entire Nyquist bandwidth. In: 2013 I.E. international solid-state circuits conference digest of technical papers (ISSCC). IEEE, pp 474–475

    Google Scholar 

  36. Mutch J, Lowe DG (2008) Object class recognition and localization using sparse features with limited receptive fields. Int J Comput Vision 80(1):45–57

    Article  Google Scholar 

  37. Maashri AA, Debole M, Cotter M, Chandramoorthy N, Xiao Y, Narayanan V, Chakrabarti C (2012) Accelerating neuromorphic vision algorithms for recognition. In: Proceedings of the 49th annual design automation conference, pp 579–584

    Google Scholar 

  38. Everingham M, Van Gool L, Williams CK, Winn J, Zisserman A (2010) The pascal visual object classes (VOC) challenge. Int. J. Comput. Vision 88(2):303–338

    Article  Google Scholar 

  39. Tang T, Xia L, Li B, Luo R, Chen Y, Wang Y, Yang H (2015) Spiking neural network with RRAM: can we use it for real-world application? In: Design, automation test in Europe conference exhibition (DATE), pp 860–865

    Google Scholar 

  40. Liu C, Yan B, Yang C, Song L, Li Z, Liu B, Chen Y, Li H, Wu Q, Jiang H (2015) A spiking neuromorphic design with resistive crossbar. In: Proceedings of the 52nd annual design automation conference. ACM, p 14

    Google Scholar 

  41. Seo J-S, Lin B, Kim M, Chen P-Y, Kadetotad D, Xu Z, Mohanty A, Vrudhula S, Yu S, Ye J et al. (2015) On-chip sparse learning acceleration with CMOS and resistive synaptic devices. IEEE Trans Nanotechnol 14(6):969–979

    Article  Google Scholar 

  42. Mazady A, Rahman MT, Forte D, Anwar M (2015) Memristor PUF—a security primitive: theory and experiment. IEEE J Emerging Sel Top Circuits Syst 5(2):222–229

    Article  Google Scholar 

  43. Bojnordi M, Ipek E (2016) Memristive Boltzmann machine: a hardware accelerator for combinatorial optimization and deep learning. In: International symposium on high performance computer architecture (HPCA)

    Google Scholar 

  44. Li B, Gu P, Shan Y, Wang Y, Chen Y, Yang H, Rram-based analog approximate computing. IEEE Trans Comput Aided Des Integr Circuits Syst 34(12):1905–1917

    Google Scholar 

  45. Kim K-H, Gaba S, Wheeler D, Cruz-Albrecht JM, Hussain T, Srinivasa N, Lu W (2011) A functional hybrid memristor crossbar-array/CMOS system for data storage and neuromorphic applications. Nano Lett 12(1):389–395

    Article  Google Scholar 

  46. Liu X, Mao M, Liu B, Li H, Chen Y, Li B, Wang Y, Jiang H, Barnell M, Wu Q et al. (2015) Reno: a high-efficient reconfigurable neuromorphic computing accelerator design. In: 2015 52nd ACM/EDAC/IEEE design automation conference (DAC). IEEE, pp 1–6

    Google Scholar 

  47. Xia L, Li B, Tang T, Gu P, Yin X, Huangfu W, Chen P-Y, Yu S, Cao Y, Wang Y, Xie Y, Yang H (2016) Mnsim: simulation platform for memristor-based neuromorphic computing system. In: Proceedings of the conference on design, automation & test in Europe. European Design and Automation Association

    Google Scholar 

  48. Li B, Xia L, Gu P, Wang Y, Yang H, Merging the interface: power, area and accuracy co-optimization for rram crossbar-based mixed-signal computing system. In: 2015 52nd ACM/EDAC/IEEE design automation conference (DAC), pp 1–6

    Google Scholar 

  49. Liu B, Li H, Chen Y, Li X, Wu Q, Huang T (2015) Vortex: variation-aware training for memristor x-bar. In: Proceedings of the 52nd annual design automation conference. ACM, p 15

    Google Scholar 

  50. Prezioso M, Merrikh-Bayat F, Hoskins B, Adam G, Likharev KK, Strukov DB (2015) Training and operation of an integrated neuromorphic network based on metal-oxide memristors. Nature 521(7550):61–64 2015.

    Article  Google Scholar 

  51. Liu B, Li H, Chen Y, Li X, Huang T, Wu Q, Barnell M, Reduction and ir-drop compensations techniques for reliable neuromorphic computing systems. In: 2014 IEEE/ACM international conference on computer-aided design (ICCAD). IEEE, pp 63–70

    Google Scholar 

  52. Wen W, Wu C-R, Hu X, Liu B, Ho T-Y, Li X, Chen Y (2015) An EDA framework for large scale hybrid neuromorphic computing systems. In: Proceedings of the 52nd annual design automation conference. ACM, p 12

    Google Scholar 

Download references

Acknowledgements

This work was supported by 973 Project 2013CB329000, National Natural Science Foundation of China (No. 61373026), Brain Inspired Computing Research, Tsinghua University (20141080934), Tsinghua University Initiative Scientific Research Program, the Importation and Development of High-Caliber Talents Project of Beijing Municipal Institutions.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Yu Wang .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Wang, Y., Li, B., Xia, L., Tang, T., Yang, H. (2017). Energy Efficient RRAM Crossbar-Based Approximate Computing for Smart Cameras. In: Kyung, CM., Yasuura, H., Liu, Y., Lin, YL. (eds) Smart Sensors and Systems. Springer, Cham. https://doi.org/10.1007/978-3-319-33201-7_5

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-33201-7_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-33200-0

  • Online ISBN: 978-3-319-33201-7

  • eBook Packages: EngineeringEngineering (R0)

Publish with us

Policies and ethics