Introduction

  • Mingu Kang
  • Sujan Gonugondla
  • Naresh R. Shanbhag

Abstract

There is much interest today in incorporating artificial intelligence (AI) capabilities into products and services in both the commercial and defense industries. Such tasks are realized in the Cloud at present due to the availability of sufficient computational resources there. However, there is growing interest in embedding data analytics into sensor-rich platforms at the Edge, including wearables, autonomous vehicles, personal biomedical devices, Internet of Things (IoT) devices, and others, to provide them with local decision-making capabilities. Such platforms, though (sensory) data-rich, are heavily constrained in terms of computational resources (storage, processing, and communications), energy, latency, and form factor. This book describes a unique approach to realizing that objective: the Deep In-memory Architecture (DIMA).
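To make the Edge constraint concrete, the sketch below gives a back-of-envelope energy model for a conventional (von Neumann) inference accelerator, in which every operand must be fetched from memory before it can be used. The per-operation energies, the inference_energy_nj helper, and the layer size are illustrative assumptions of the kind found in the energy-efficiency literature (e.g., Horowitz's ISSCC 2014 keynote), not figures from this book.

```python
# Back-of-envelope energy model for a conventional (von Neumann) inference
# accelerator. All per-operation energies below are representative assumed
# values for illustration only, not measurements from this book.

E_MAC_PJ = 1.0         # assumed energy of one 8-bit multiply-accumulate (pJ)
E_SRAM_READ_PJ = 5.0   # assumed energy of one 8-bit on-chip SRAM read (pJ)
E_DRAM_READ_PJ = 640.0 # assumed energy of one 8-bit off-chip DRAM read (pJ)

def inference_energy_nj(n_macs: int, sram_reads: int, dram_reads: int) -> float:
    """Total energy (nJ) of one inference under a simple additive model."""
    total_pj = (n_macs * E_MAC_PJ
                + sram_reads * E_SRAM_READ_PJ
                + dram_reads * E_DRAM_READ_PJ)
    return total_pj / 1000.0

# Hypothetical small fully connected layer: 256 x 256 weights, each weight
# fetched once from on-chip SRAM (the best case for a conventional design).
n_macs = 256 * 256
compute_nj = n_macs * E_MAC_PJ / 1000.0
total_nj = inference_energy_nj(n_macs, sram_reads=n_macs, dram_reads=0)

print(f"compute-only energy:   {compute_nj:.1f} nJ")
print(f"compute + SRAM access: {total_nj:.1f} nJ")
# Even with all weights on chip, memory accesses cost ~5x the arithmetic
# under these assumptions -- the data-movement gap that DIMA targets by
# embedding computation directly into the memory array.
```

Under these assumed numbers, memory accesses dominate the energy budget even before any off-chip DRAM traffic is counted; this data-movement bottleneck is the central motivation for the in-memory approach developed in this book.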

Keywords

In-memory · Deep in-memory · Near-memory · Machine learning · Accelerators · von Neumann architecture · Sensor · Edge devices · IoT · Artificial intelligence · Wearables · Biomedical

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Mingu Kang (1)
  • Sujan Gonugondla (2)
  • Naresh R. Shanbhag (2)
  1. IBM T. J. Watson Research Center, Old Tappan, USA
  2. University of Illinois at Urbana-Champaign, Urbana, USA