Abstract
Package managers, containers, automated testing, and Continuous Integration (CI) are becoming essential parts of HPC development workflows. These automated tools often require software recompilation. However, large stacks such as those deployed on HPC clusters can have combinatorial dependencies and may take a system several days to compile. Despite the use of simple parallelization (such as ‘make -j’), build execution time often does not scale with system resources. In such cases, overall installation time can be improved by compiling parts of the software stack independently, each scheduled on a subset of the available cores. We apply malleable-task scheduling algorithms to better exploit the available parallelism in build system workflows and improve overall stack build time. In a prototype implementation in the Spack package manager, malleable-task scheduling can improve build times by more than 2x.
Under the terms of Contract DE-NA0003525, there is a non-exclusive license for use of this work by or on behalf of the U.S. Government.
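The idea in the abstract, building independent packages concurrently on subsets of the cores rather than one at a time on all of them, can be illustrated with a minimal sketch. This is not the scheduling algorithm or the Spack prototype from the paper. It assumes an Amdahl's-law cost model, an even split of cores among ready packages, and a made-up four-package dependency graph; the names `build_time`, `serial_stack_time`, and `level_split_time`, the package names, and all numbers are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_time(work, par_frac, cores):
    """Amdahl's-law estimate: only `par_frac` of the work speeds up with cores."""
    return work * ((1.0 - par_frac) + par_frac / cores)


def serial_stack_time(pkgs, total_cores):
    """Baseline: build packages one at a time, each using every core."""
    return sum(build_time(w, f, total_cores) for w, f in pkgs.values())


def level_split_time(pkgs, deps, total_cores):
    """Toy heuristic: build all ready packages at once, splitting cores evenly."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    elapsed = 0.0
    while sorter.is_active():
        ready = list(sorter.get_ready())
        # Never run more packages at once than there are cores.
        for i in range(0, len(ready), total_cores):
            batch = ready[i:i + total_cores]
            share = total_cores // len(batch)  # cores given to each package
            elapsed += max(build_time(*pkgs[p], share) for p in batch)
        sorter.done(*ready)
    return elapsed


# Hypothetical stack: name -> (single-core build time, parallel fraction).
pkgs = {"a": (100, 0.5), "b": (100, 0.5), "c": (100, 0.5), "d": (100, 0.5)}
# Hypothetical dependencies: name -> set of packages it depends on.
deps = {"a": set(), "b": set(), "c": set(), "d": {"a", "b", "c"}}

if __name__ == "__main__":
    print("one at a time:", serial_stack_time(pkgs, 8))
    print("split cores:  ", level_split_time(pkgs, deps, 8))
```

With these invented numbers and 8 cores, the estimated time drops from 225 to 131.25 time units, because each package scales poorly beyond a few cores while three of the four packages are independent. The figures only illustrate the cost model and are not measurements from the paper.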
Copyright information
© 2020 National Technology & Engineering Solutions of Sandia, LLC.
About this paper
Cite this paper
Knight, S., Wilke, J., Gamblin, T. (2020). Using Malleable Task Scheduling to Accelerate Package Manager Installations. In: Juckeland, G., Chandrasekaran, S. (eds) Tools and Techniques for High Performance Computing. HUST 2019, SE-HER 2019, WIHPC 2019. Communications in Computer and Information Science, vol 1190. Springer, Cham. https://doi.org/10.1007/978-3-030-44728-1_2
DOI: https://doi.org/10.1007/978-3-030-44728-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-44727-4
Online ISBN: 978-3-030-44728-1
eBook Packages: Computer Science, Computer Science (R0)