OpenMP as a High-Level Specification Language for Parallelism
While OpenMP is the de facto standard among shared-memory parallel programming models, a number of alternative programming models and runtime systems have arisen in recent years. Fairly evaluating these programming systems can be challenging and can require significant manual effort on the part of researchers. However, facilitating these comparisons is important as a way of advancing both the available OpenMP runtimes and the research being done with these novel programming systems.
In this paper we present the OpenMP-to-X framework, an open-source tool for mapping OpenMP constructs and APIs to other parallel programming systems. We apply OpenMP-to-X to the HClib parallel programming library and use it to enable a fair and objective comparison of performance and programmability among HClib, GNU OpenMP, and Intel OpenMP. We use this investigation to expose performance bottlenecks in both the Intel OpenMP and HClib runtimes, to motivate improvements to the HClib programming model and runtime, and to propose potential extensions to the OpenMP standard. Our performance analysis shows that, across a wide range of benchmarks, HClib exhibits significantly less volatility in its performance, with a median standard deviation of 1.03% in execution times, and outperforms the two OpenMP implementations on 15 out of 24 benchmarks.
Keywords: Programming Model · Parallel Programming · Work Thread · Percent Standard Deviation · Parallel Programming Model
This work was supported in part by the Data Analysis and Visualization Cyberinfrastructure funded by NSF under grant OCI-0959097 and Rice University.
The authors would also like to acknowledge the contributions of Vivek Kumar, Nick Vrvilo, and Vincent Cave to the HClib project.