
Partition and Scheduling Algorithms for Neural Network Accelerators

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11719)

Included in the conference series: Advanced Parallel Processing Technologies (APPT 2019)

Abstract

In recent years, artificial neural networks have evolved rapidly and are being applied in a wide range of fields. Meanwhile, a growing number of neural network accelerators have been developed to improve the computational efficiency of neural network applications. Although task scheduling on heterogeneous systems has been studied intensively, traditional scheduling algorithms cannot be applied to neural network accelerators directly. Based on the typical characteristics of neural network accelerators, we formalize the task-scheduling problem for neural networks and adapt two list-scheduling heuristics, Heterogeneous-Earliest-Finish-Time (HEFT) and Critical-Path-on-a-Processor (CPOP). Inspired by the separable nature of neural network operations, we further propose two partition algorithms, the Iterative Partition Scheduling algorithm (IPS) and the Partition Scheduling Combination algorithm (PSC), which can be combined with the scheduling algorithms. Experiments on several typical neural networks show that, compared with scheduling-only algorithms, the partition-based algorithms achieve about a 2x to 3x speedup.
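For readers unfamiliar with list scheduling, the sketch below illustrates the HEFT idea that the paper adapts: order tasks by their upward rank along the task DAG, then greedily assign each task to the processor that yields the earliest finish time. This follows the classic formulation of Topcuoglu et al., not the authors' accelerator-specific version; the toy task graph, cost tables, and two-processor setup are invented purely for illustration.

```python
# Minimal sketch of HEFT-style list scheduling on a task DAG.
# Illustrative only: the graph, costs, and processor count are made up
# and do not come from the paper.
from functools import lru_cache

# Task graph: task -> successors. costs[t][p] is the execution time of
# task t on processor p; comm[(u, v)] is the data-transfer time paid only
# when u and v run on different processors.
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
costs = {"a": [2, 4], "b": [3, 2], "c": [4, 3], "d": [2, 2]}
comm = {("a", "b"): 1, ("a", "c"): 2, ("b", "d"): 1, ("c", "d"): 1}
N_PROCS = 2

@lru_cache(maxsize=None)
def upward_rank(task: str) -> float:
    """rank_u(t) = mean cost of t + max over successors s of comm(t, s) + rank_u(s)."""
    avg = sum(costs[task]) / N_PROCS
    tail = max((comm[task, s] + upward_rank(s) for s in succ[task]), default=0.0)
    return avg + tail

preds = {t: [u for u in succ if t in succ[u]] for t in succ}
proc_free = [0.0] * N_PROCS   # time at which each processor becomes idle
placed = {}                   # task -> (processor, start, finish)

# Schedule in decreasing upward rank; greedily pick the processor giving
# the earliest finish time. (Full HEFT additionally considers inserting a
# task into idle gaps between already-scheduled tasks.)
for t in sorted(succ, key=upward_rank, reverse=True):
    best = None
    for p in range(N_PROCS):
        # Earliest start on p: all predecessor outputs must have arrived.
        ready = max((placed[u][2] + (comm[u, t] if placed[u][0] != p else 0)
                     for u in preds[t]), default=0.0)
        start = max(ready, proc_free[p])
        finish = start + costs[t][p]
        if best is None or finish < best[2]:
            best = (p, start, finish)
    placed[t] = best
    proc_free[best[0]] = best[2]

for t, (p, s, f) in placed.items():
    print(f"task {t}: proc {p}, start {s:.1f}, finish {f:.1f}")
```

The partition algorithms (IPS and PSC) then exploit the observation that many neural network operators are separable, for example a convolution can be split along its output channels, so a single task node can be divided into sub-tasks that run on several accelerator cores in parallel; this partitioning is the source of the speedup reported over scheduling-only baselines.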

This work is partially supported by the National Key Research and Development Program of China (under Grant 2017YFB1003101), the NSF of China (under Grants 61432016, 61532016, 61672491, 61602441, 61602446, 61732002, 61702478, 61732007 and 61732020), Beijing Natural Science Foundation (JQ18013), the 973 Program of China (under Grant 2015CB358800), National Science and Technology Major Project (2018ZX01031102), the Transformation and Transfer of Scientific and Technological Achievements of Chinese Academy of Sciences (KFJ-HGZX-013), Key Research Projects in Frontier Science of Chinese Academy of Sciences (QYZDB-SSW-JSC001), Strategic Priority Research Program of Chinese Academy of Science (XDB32050200, XDC01020000) and Standardization Research Project of Chinese Academy of Sciences (BZ201800001).



Author information

Corresponding author: Tian Zhi.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Chen, X., et al. (2019). Partition and Scheduling Algorithms for Neural Network Accelerators. In: Yew, P.C., Stenström, P., Wu, J., Gong, X., Li, T. (eds) Advanced Parallel Processing Technologies. APPT 2019. Lecture Notes in Computer Science, vol 11719. Springer, Cham. https://doi.org/10.1007/978-3-030-29611-7_5


  • DOI: https://doi.org/10.1007/978-3-030-29611-7_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29610-0

  • Online ISBN: 978-3-030-29611-7

  • eBook Packages: Computer Science (R0)
