Resource Contention Aware Execution of Multiprocessor Tasks on Heterogeneous Platforms
In high performance computing (HPC), the tasks of complex applications have to be assigned to the compute nodes of heterogeneous HPC platforms in such a way that the total execution time is minimized. Common approaches, such as task scheduling methods, usually base their decisions on task runtimes that are predicted by cost models. High accuracy and reliability of these models are crucial for achieving low execution times for all tasks. The individual runtimes of concurrently executed tasks are often affected by contention for hardware resources, such as communication networks, the main memory, or hard disks. However, existing cost models usually ignore the effects of resource contention, which leads to large deviations between predicted and measured runtimes. In this article, we present a resource contention aware cost model for the execution of multiprocessor tasks on heterogeneous platforms. The integration of the proposed model into two task scheduling methods is described. The cost model is validated in isolation as well as within the utilized scheduling methods. Performance results with different benchmark tasks and with tasks of a complex simulation application are shown to demonstrate the performance improvements achieved by taking the effects of resource contention into account.
Keywords: Resource contention, Multiprocessor tasks, Heterogeneous platforms, Scheduling methods, Distributed simulations
This work was performed within the Federal Cluster of Excellence EXC 1075 “MERGE Technologies for Multifunctional Lightweight Structures” and supported by the German Research Foundation (DFG).