Abstract
In this work, we identify and analyze patterns that recur in the development and deployment of scientific applications on clusters equipped with heterogeneous computing resources.
The main contributions are the identification of these patterns and the design and implementation of an Open MPI extension that supports the development and deployment of applications programmed with a task-based approach.
To illustrate the use of our extension, we provide the implementation and performance evaluation of two sample applications: the N-body problem and general matrix multiplication.
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Cabello, U., Rodríguez, J., Meneses-Viveros, A. (2016). An Open MPI Extension for Supporting Task Based Parallelism in Heterogeneous CPU-GPU Clusters. In: Gitler, I., Klapp, J. (eds) High Performance Computer Applications. ISUM 2015. Communications in Computer and Information Science, vol 595. Springer, Cham. https://doi.org/10.1007/978-3-319-32243-8_10
DOI: https://doi.org/10.1007/978-3-319-32243-8_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32242-1
Online ISBN: 978-3-319-32243-8
eBook Packages: Computer Science (R0)