The Orlando Project: A 28 nm FD-SOI Low Memory Embedded Neural Network ASIC

  • Giuseppe Desoli
  • Valeria Tomaselli
  • Emanuele Plebani
  • Giulio Urlini
  • Danilo Pau
  • Viviana D’Alto
  • Tommaso Majo
  • Fabio De Ambroggi
  • Thomas Boesch
  • Surinder-pal Singh
  • Elio Guidetti
  • Nitin Chawla
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10016)

Abstract

The recent success of neural networks in various computer vision tasks opens the possibility of adding visual intelligence to mobile and wearable devices; however, such networks do not meet the stringent power requirements of these devices when run on embedded CPUs or GPUs. To address this challenge, STMicroelectronics developed the Orlando Project, a new low-power architecture for convolutional neural network acceleration suited to wearable devices. A major contributor to energy usage is the storage of, and access to, the neural network parameters. In this paper, we show that with adequate model compression schemes based on weight quantization and pruning, a whole AlexNet network can fit in the local memory of an embedded processor, avoiding additional system complexity and energy usage with little or no impact on the accuracy of the network. Moreover, the compression methods work well across different tasks, e.g. image classification and object detection.
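The compression scheme summarized above combines weight quantization with pruning. As a minimal, hedged sketch of the general idea (not the paper's exact algorithm; the function name, pruning ratio, and bit width below are illustrative assumptions), magnitude-based pruning zeroes the smallest weights and uniform quantization maps the survivors onto a few integer levels, so each weight can be stored in a handful of bits instead of a 32-bit float:

```python
import numpy as np

def prune_and_quantize(weights, prune_ratio=0.5, n_bits=4):
    """Magnitude-based pruning followed by uniform quantization.

    The smallest-magnitude fraction `prune_ratio` of the weights is
    zeroed, then the remaining weights are snapped to a uniform grid
    of at most 2**n_bits - 1 levels spanning their dynamic range.
    Returns the compressed weights and the sparsity mask.
    """
    w = np.asarray(weights, dtype=np.float64)

    # Prune: zero out the prune_ratio fraction of smallest-magnitude weights.
    threshold = np.quantile(np.abs(w), prune_ratio)
    mask = np.abs(w) > threshold
    w_pruned = w * mask

    # Quantize: map survivors to symmetric integer levels in [-K, K].
    half_levels = 2 ** (n_bits - 1) - 1
    w_max = np.abs(w_pruned).max()
    if w_max == 0.0:
        return w_pruned, mask
    scale = w_max / half_levels
    w_quant = np.round(w_pruned / scale) * scale
    return w_quant * mask, mask

# Example: compress a random 256x256 weight matrix.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256))
wq, mask = prune_and_quantize(w, prune_ratio=0.5, n_bits=4)
sparsity = 1.0 - mask.mean()
```

In a deployment like the one the abstract describes, the zeroed weights would be stored in a sparse format and the quantized survivors as small integers plus a per-layer scale, which is what lets the full parameter set fit in on-chip memory.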

Keywords

Convolutional neural networks · Hardware acceleration


Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Giuseppe Desoli¹
  • Valeria Tomaselli²
  • Emanuele Plebani³
  • Giulio Urlini³
  • Danilo Pau³
  • Viviana D’Alto³
  • Tommaso Majo¹
  • Fabio De Ambroggi³
  • Thomas Boesch¹
  • Surinder-pal Singh⁴
  • Elio Guidetti¹
  • Nitin Chawla⁴

  1. STMicroelectronics, Cornaredo, Italy
  2. STMicroelectronics, Catania, Italy
  3. STMicroelectronics, Agrate Brianza, Italy
  4. STMicroelectronics, Greater Noida, India