Abstract
In recent years, artificial neural networks have evolved rapidly and are applied in a wide range of fields. Meanwhile, to improve the computational efficiency of neural network applications, more and more neural network accelerators have been developed. Although task scheduling on heterogeneous systems has been studied intensively, traditional scheduling algorithms cannot be applied to neural network accelerators directly. Based on the typical characteristics of neural network accelerators, we formalize the task scheduling problem for neural networks and adapt two list-based heuristic scheduling algorithms, Heterogeneous-Earliest-Finish-Time (HEFT) and Critical-Path-on-a-Processor (CPOP). Inspired by the separable nature of neural network operations, we further propose two partition algorithms, the Iterative Partition Scheduling algorithm (IPS) and the Partition Scheduling Combination algorithm (PSC), which can be combined with the scheduling algorithms. Experiments on several typical neural networks show that the partition-based algorithms achieve roughly 2x to 3x speedup over the scheduling-only algorithms.
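To make the list-scheduling baseline concrete, the sketch below shows HEFT-style scheduling in the form described by Topcuoglu et al.: tasks are prioritized by upward rank and each is placed on the processor that minimizes its earliest finish time. This is a minimal illustration, not the paper's implementation; the toy DAG, the per-processor costs, and the simpler non-insertion placement policy are all assumptions made for brevity.

```python
# Minimal HEFT-style list scheduling sketch (after Topcuoglu et al.).
# The task graph and cost tables below are illustrative assumptions.
from functools import lru_cache

# DAG: task -> list of (successor, communication cost paid across processors)
succ = {"a": [("b", 2), ("c", 3)], "b": [("d", 1)], "c": [("d", 2)], "d": []}
# cost[task][p] = execution time of task on processor p (2 heterogeneous procs)
cost = {"a": [4, 6], "b": [3, 2], "c": [5, 4], "d": [2, 3]}
NUM_PROCS = 2

@lru_cache(maxsize=None)
def upward_rank(task):
    # rank_u(t) = average execution cost + max over successors of (comm + rank_u)
    avg = sum(cost[task]) / NUM_PROCS
    return avg + max((c + upward_rank(s) for s, c in succ[task]), default=0.0)

def heft():
    # Decreasing upward rank is always a valid topological order of the DAG.
    order = sorted(cost, key=upward_rank, reverse=True)
    proc_free = [0.0] * NUM_PROCS        # earliest time each processor is free
    finish, placed = {}, {}              # finish time and processor per task
    pred = {t: [] for t in cost}         # invert the DAG to find predecessors
    for t, edges in succ.items():
        for s, c in edges:
            pred[s].append((t, c))
    for t in order:
        best = None
        for p in range(NUM_PROCS):
            # Data from a predecessor on another processor pays the comm cost.
            ready = max((finish[u] + (c if placed[u] != p else 0)
                         for u, c in pred[t]), default=0.0)
            eft = max(proc_free[p], ready) + cost[t][p]
            if best is None or eft < best[0]:
                best = (eft, p)
        finish[t], placed[t] = best
        proc_free[best[1]] = best[0]
    return placed, finish

if __name__ == "__main__":
    placement, finish_times = heft()
    print(placement, finish_times)
```

The partition algorithms (IPS and PSC) build on such a scheduler by splitting separable operations into smaller tasks before (or interleaved with) scheduling, which exposes more parallelism across accelerators than scheduling the original operations alone.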
This work is partially supported by the National Key Research and Development Program of China (under Grant 2017YFB1003101), the NSF of China (under Grants 61432016, 61532016, 61672491, 61602441, 61602446, 61732002, 61702478, 61732007 and 61732020), Beijing Natural Science Foundation (JQ18013), the 973 Program of China (under Grant 2015CB358800), National Science and Technology Major Project (2018ZX01031102), the Transformation and Transfer of Scientific and Technological Achievements of Chinese Academy of Sciences (KFJ-HGZX-013), Key Research Projects in Frontier Science of Chinese Academy of Sciences (QYZDB-SSW-JSC001), Strategic Priority Research Program of Chinese Academy of Science (XDB32050200, XDC01020000) and Standardization Research Project of Chinese Academy of Sciences (BZ201800001).