
Mixed Precision Quantization Scheme for Re-configurable ReRAM Crossbars Targeting Different Energy Harvesting Scenarios

  • Md Fahim Faysal Khan
  • Nicholas Anton Jao
  • Changchi Shuai
  • Keni Qiu
  • Mehrdad Mahdavi
  • Vijaykrishnan Narayanan
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 574)

Abstract

Crossbar arrays with non-volatile memory have recently become popular for DNN acceleration because their in-memory-computing capability and low power requirements make them suitable for deployment at the edge. Quantized Neural Networks (QNNs) enable inference with limited hardware resources and power availability and can easily be ported to smaller devices. At the same time, energy harvesting has shown great promise for making edge devices self-sustaining. However, the power supplied by energy harvesting sources is not constant, which is problematic because a fixed trained neural network requires a constant amount of power to run inference. This work addresses this issue by tuning network precision at layer granularity to match the variable power availability predicted for different energy harvesting scenarios.
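The sketch below illustrates the general idea of layer-granularity precision tuning under an energy budget; it is not the paper's implementation. The candidate bitwidths, the per-layer energy model, the sensitivity scores, and the budget value are all hypothetical placeholders.

```python
# Minimal sketch (not the paper's method): choose per-layer bitwidths so that
# the predicted inference energy fits a budget derived from the harvested power.
# Energy model, sensitivity scores, and budget below are hypothetical.

BITWIDTHS = [2, 4, 8]  # candidate precisions, lowest to highest


def layer_energy(num_macs, bits):
    # Hypothetical model: crossbar energy scales roughly with the number of
    # bit-serial cycles, i.e. proportionally to the operand bitwidth.
    return num_macs * bits


def assign_precisions(layers, energy_budget):
    """Greedy mixed-precision assignment.

    Start every layer at the highest precision, then repeatedly lower the
    precision of the least sensitive layer until the predicted energy fits
    the budget (or no further reduction is possible).
    """
    bits = {name: max(BITWIDTHS) for name, _, _ in layers}

    def total_energy():
        return sum(layer_energy(macs, bits[name]) for name, macs, _ in layers)

    order = sorted(layers, key=lambda layer: layer[2])  # least sensitive first
    while total_energy() > energy_budget:
        lowered = False
        for name, _, _ in order:
            if bits[name] > min(BITWIDTHS):
                bits[name] = BITWIDTHS[BITWIDTHS.index(bits[name]) - 1]
                lowered = True
                break
        if not lowered:  # budget cannot be met even at the lowest precision
            break
    return bits


if __name__ == "__main__":
    # (layer name, #MAC operations, accuracy-sensitivity score) -- made-up values.
    layers = [("conv1", 1e6, 0.9), ("conv2", 4e6, 0.4), ("fc", 2e6, 0.2)]
    print(assign_precisions(layers, energy_budget=3.0e7))
    # e.g. {'conv1': 8, 'conv2': 4, 'fc': 2}
```

In this toy run the least sensitive layers give up precision first, so the predicted energy drops below the budget while the most accuracy-critical layer keeps its full bitwidth.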

Keywords

Quantization · Deep learning · ReRAM · Crossbar · Energy harvesting · Power predictor


Acknowledgements

This work was supported in part by Semiconductor Research Corporation (SRC), Center for Brain-inspired Computing (C-BRIC), Center for Research in Intelligent Storage and Processing in Memory (CRISP), NSF Expeditions in Computing CCF-1317560, National Natural Science Foundation of China [NSFC Project No. 61872251] and Beijing Advanced Innovation Center for Imaging Technology.


Copyright information

© IFIP International Federation for Information Processing 2020

Authors and Affiliations

  • Md Fahim Faysal Khan (1)
  • Nicholas Anton Jao (1)
  • Changchi Shuai (2)
  • Keni Qiu (2)
  • Mehrdad Mahdavi (3)
  • Vijaykrishnan Narayanan (3)
  1. Department of Electrical Engineering, Pennsylvania State University, University Park, USA
  2. Capital Normal University, Beijing, China
  3. Department of Computer Science and Engineering, Pennsylvania State University, University Park, USA
