Exploiting and Evaluating OpenSHMEM on KNL Architecture

Hashmi, Jahanzeb Maqbool; Li, Mingzhe; Subramoni, Hari; Panda, Dhabaleswar K.

doi:10.1007/978-3-319-73814-7_10

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10679)

Included in the following conference series:

Workshop on OpenSHMEM and Related Technologies

Abstract

Manycore processors such as the Intel Xeon Phi (KNL) with on-package Multi-Channel DRAM (MCDRAM) are driving a paradigm shift in the High Performance Computing (HPC) industry. PGAS programming models such as OpenSHMEM, with their lightweight synchronization primitives and shared-memory abstractions, are considered a good fit for irregular communication patterns. While conventional programming models such as MPI/OpenMP have started utilizing systems with KNL processors, it is still not clear whether PGAS models can easily adapt to and fully utilize such systems. In this paper, we conduct a comprehensive performance evaluation of the OpenSHMEM runtime on many-/multi-core processors. We also explore the performance benefits that the highly multithreaded KNL, along with its AVX-512 extensions and MCDRAM, offers the OpenSHMEM programming model. We evaluate intra- and inter-node performance of OpenSHMEM primitives on different application kernels. Our evaluation of application kernels such as the NAS Parallel Benchmarks and 3D-Stencil shows that OpenSHMEM with the MVAPICH2-X runtime is able to take advantage of the AVX-512 extensions and MCDRAM to exploit the architectural features provided by KNL processors.

This research is supported in part by National Science Foundation grants #CNS-1419123, #CNS-1513120, #ACI-1450440 and #CCF-1565414.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, The Ohio State University, Columbus, USA
Jahanzeb Maqbool Hashmi, Mingzhe Li, Hari Subramoni & Dhabaleswar K. Panda

Corresponding author

Correspondence to Jahanzeb Maqbool Hashmi.

Editor information

Editors and Affiliations

Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Manjunath Gorentla Venkata
Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Neena Imam
Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Swaroop Pophale

About this paper

Cite this paper

Hashmi, J.M., Li, M., Subramoni, H., Panda, D.K. (2018). Exploiting and Evaluating OpenSHMEM on KNL Architecture. In: Gorentla Venkata, M., Imam, N., Pophale, S. (eds) OpenSHMEM and Related Technologies. Big Compute and Big Data Convergence. OpenSHMEM 2017. Lecture Notes in Computer Science, vol 10679. Springer, Cham. https://doi.org/10.1007/978-3-319-73814-7_10

DOI: https://doi.org/10.1007/978-3-319-73814-7_10
Published: 10 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73813-0
Online ISBN: 978-3-319-73814-7
eBook Packages: Computer Science, Computer Science (R0)
