An Evaluation of MPI and OpenMP Paradigms for Multi-Dimensional Data Remapping

  • Yun He
  • Chris H. Q. Ding
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2716)


We evaluate dynamic data remapping on clusters of SMP architectures under the OpenMP, MPI, and hybrid paradigms. The traditional method of multi-dimensional array transpose needs an auxiliary array of the same size and a copy-back stage. We recently developed an in-place method using vacancy tracking cycles. The vacancy tracking algorithm outperforms the traditional two-array method, as demonstrated by extensive comparisons. The performance of multi-threaded parallelism using OpenMP is first tested with different scheduling methods and different numbers of threads. Both methods are then parallelized using several parallel paradigms. At the node level, pure OpenMP outperforms pure MPI by a factor of 2.76 for the vacancy tracking method. Across the entire cluster of SMP nodes, with carefully chosen thread numbers, the hybrid MPI/OpenMP implementation outperforms pure MPI by a factor of 3.79 for the traditional method and 4.44 for the vacancy tracking method, demonstrating the validity of the parallel paradigm of mixing MPI with OpenMP.
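The difference between the two transpose methods can be illustrated with a minimal serial sketch for the two-dimensional case. It is not the authors' implementation: the function names, the row-major flat-list layout, and the `visited` flag array are assumptions made here for clarity, and the sketch covers neither higher-dimensional arrays nor the OpenMP/MPI parallelization evaluated in the paper. It relies on the standard fact that, for a `rows x cols` array of `n` elements stored row-major, the element that belongs at flat position `p` after transposition comes from position `(p * cols) mod (n - 1)`.

```python
def transpose_two_array(a, rows, cols):
    """Traditional method: fill an auxiliary array of the same size, then copy back."""
    aux = [None] * (rows * cols)
    for r in range(rows):
        for c in range(cols):
            aux[c * rows + r] = a[r * cols + c]
    a[:] = aux  # copy-back stage


def transpose_in_place(a, rows, cols):
    """In-place transpose by following cycles of vacancies (illustrative sketch)."""
    n = rows * cols
    if n < 3:
        return  # zero, one, or two elements never move
    # Assumption of this sketch: one flag per element marks positions already settled.
    visited = bytearray(n)
    for start in range(1, n - 1):  # positions 0 and n-1 are fixed points
        if visited[start]:
            continue
        saved = a[start]   # lifting this element out creates the first vacancy
        vacancy = start
        while True:
            visited[vacancy] = 1
            source = (vacancy * cols) % (n - 1)  # who belongs in the vacancy
            if source == start:
                break
            a[vacancy] = a[source]  # fill the vacancy; the source becomes vacant
            vacancy = source
        a[vacancy] = saved  # the saved element closes the cycle


matrix = [1, 2, 3,
          4, 5, 6]          # 2 x 3, row-major
transpose_in_place(matrix, 2, 3)
print(matrix)               # [1, 4, 2, 5, 3, 6], i.e. the 3 x 2 transpose
```

Each element is moved exactly once and only one element is held in temporary storage per cycle, which is why no same-size auxiliary array or copy-back stage is needed. The per-element flag array is a simplification of this sketch only.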


Dynamical data remapping, multidimensional arrays, index reshuffle, vacancy tracking cycles, global exchange, MPI, OpenMP, hybrid MPI/OpenMP, SMP cluster





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Yun He (1)
  • Chris H. Q. Ding (1)
  1. CRD Division, Lawrence Berkeley National Laboratory, Berkeley, USA
