An Open MPI Extension for Supporting Task Based Parallelism in Heterogeneous CPU-GPU Clusters

Cabello, Uriel; Rodríguez, José; Meneses-Viveros, Amilcar

doi:10.1007/978-3-319-32243-8_10

Part of the book series: Communications in Computer and Information Science (CCIS, volume 595)

Included in the following conference series: International Conference on Supercomputing in Mexico

Abstract

In this work we identify and analyze some of the patterns that appear in the development and deployment of scientific applications on clusters equipped with heterogeneous computing resources.

The main contributions of this work are the identification of these patterns and the design and implementation of an Open MPI extension that supports the development and deployment of applications programmed with a task-based approach.

To illustrate how to use our extension, we provide the implementation and performance evaluation of two sample applications: the N-Body problem and general matrix multiplication.

Author information

Authors and Affiliations

Department of Computer Science, Center of Research and Advanced Studies (Cinvestav), Mexico City, Mexico
Uriel Cabello, José Rodríguez & Amilcar Meneses-Viveros

Corresponding author

Correspondence to Uriel Cabello.

Editor information

Editors and Affiliations

ABACUS Centro de Matemáticas Aplicadas, CINVESTAV-IPN, La Marquesa, Mexico
Isidoro Gitler
Instituto Nacional de Investigaciones Nu, La Marquesa, Mexico
Jaime Klapp

About this paper

Cite this paper

Cabello, U., Rodríguez, J., Meneses-Viveros, A. (2016). An Open MPI Extension for Supporting Task Based Parallelism in Heterogeneous CPU-GPU Clusters. In: Gitler, I., Klapp, J. (eds) High Performance Computer Applications. ISUM 2015. Communications in Computer and Information Science, vol 595. Springer, Cham. https://doi.org/10.1007/978-3-319-32243-8_10

DOI: https://doi.org/10.1007/978-3-319-32243-8_10
Published: 08 April 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-32242-1
Online ISBN: 978-3-319-32243-8
eBook Packages: Computer Science, Computer Science (R0)
