An Evaluation of MPI and OpenMP Paradigms for Multi-Dimensional Data Remapping
We evaluate dynamic data remapping on clusters of SMP architectures under the OpenMP, MPI, and hybrid MPI/OpenMP paradigms. The traditional method of multi-dimensional array transposition requires an auxiliary array of the same size and a copy-back stage. We recently developed an in-place method based on vacancy tracking cycles. Extensive comparisons demonstrate that the vacancy tracking algorithm outperforms the traditional two-array method. The performance of multi-threaded parallelism using OpenMP is first tested with different scheduling methods and different numbers of threads. Both methods are then parallelized using several parallel paradigms. At the node level, pure OpenMP outperforms pure MPI by a factor of 2.76 for the vacancy tracking method. Across the entire cluster of SMP nodes, with carefully chosen thread counts, the hybrid MPI/OpenMP implementation outperforms pure MPI by a factor of 3.79 for the traditional method and 4.44 for the vacancy tracking method, demonstrating the validity of the hybrid MPI/OpenMP parallel paradigm.
Keywords: dynamic data remapping, multidimensional arrays, index reshuffle, vacancy tracking cycles, global exchange, MPI, OpenMP, hybrid MPI/OpenMP, SMP cluster
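The in-place transpose via vacancy tracking cycles can be sketched as follows, for a single-threaded n1 × n2 row-major array. The element at linear index i belongs at (i*n1) mod (n1*n2 − 1), with indices 0 and n1*n2 − 1 as fixed points; following each cycle from a vacated slot moves one element per step using O(1) data workspace instead of a full auxiliary array. The `visited` bitmap used here for cycle bookkeeping is a simplification for clarity and is an assumption of this sketch, not the authors' tuned implementation, which determines cycle starting points directly.

```c
#include <stdlib.h>
#include <stdbool.h>

/* In-place transpose of an n1 x n2 row-major array using vacancy
 * tracking cycles: lifting out one element creates a vacancy, the
 * element that belongs there is moved in (creating a new vacancy at
 * its source), and the process repeats until the cycle closes.
 * Sketch only: the visited[] bitmap replaces the analytical
 * cycle-start computation of the published algorithm. */
void transpose_inplace(double *a, int n1, int n2) {
    long n = (long)n1 * n2;
    if (n1 <= 1 || n2 <= 1) return;          /* a vector transposes trivially */

    bool *visited = calloc((size_t)n, sizeof(bool));
    /* Indices 0 and n-1 are fixed points; every other index lies on
     * some cycle of the permutation dest(i) = (i * n1) mod (n - 1). */
    for (long start = 1; start < n - 1; start++) {
        if (visited[start]) continue;
        long vac = start;
        double tmp = a[start];               /* create the initial vacancy */
        for (;;) {
            /* Old index whose element belongs at position 'vac':
             * the inverse permutation, since n1 * n2 == 1 mod (n - 1). */
            long src = (vac * n2) % (n - 1);
            visited[vac] = true;
            if (src == start) {              /* cycle closed: drop tmp in */
                a[vac] = tmp;
                break;
            }
            a[vac] = a[src];                 /* fill vacancy; it moves to src */
            vac = src;
        }
    }
    free(visited);
}
```

For example, transposing the 2 × 3 array {0,1,2,3,4,5} (rows {0,1,2} and {3,4,5}) yields {0,3,1,4,2,5}, i.e. rows {0,3}, {1,4}, {2,5}.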