Performance Evaluation and Scalability Analysis of NPB-MZ on Intel Xeon Phi Coprocessor

  • Yuqian Li
  • Yonggang Che
  • Zhenghua Wang
Part of the Communications in Computer and Information Science book series (CCIS, volume 396)


Intel Many Integrated Core (Intel MIC) is a novel architecture for high performance computing (HPC). It features large-scale thread parallelism and wide vector processing units, and targets highly parallel applications. HPC communities face the problem of porting their applications to MIC platforms, and how current HPC applications can exploit the capabilities of MIC remains an open question. This paper evaluates the performance of the NPB-MZ programs, which are derived from real-world Computational Fluid Dynamics (CFD) applications, on the Intel Xeon Phi coprocessor, the first MIC product. The strong scaling behavior of the applications is investigated for different process-thread combinations. The performance obtained on the Intel Xeon Phi coprocessor is compared against that obtained on Sandy Bridge CPU based compute nodes. The results show that these programs achieve good parallel scalability when run with appropriate combinations of processes and threads. However, their absolute performance on the Intel Xeon Phi coprocessor is significantly lower than on the CPU node, due primarily to the much lower single-thread performance. These findings can help guide the performance optimization of other applications on MIC.
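
To make the strong-scaling terminology concrete, here is a minimal sketch of how process-thread combinations for a fixed hardware-thread budget might be enumerated, and how speedup and parallel efficiency are conventionally computed when the problem size is held fixed. The function names and the timing values are hypothetical placeholders chosen for illustration; they are not taken from the paper and do not represent its measurements or its methodology.

```python
def process_thread_combinations(total_threads):
    """Return every (processes, threads_per_process) pair whose product
    equals total_threads. Hypothetical helper, for illustration only."""
    return [
        (procs, total_threads // procs)
        for procs in range(1, total_threads + 1)
        if total_threads % procs == 0
    ]


def strong_scaling(baseline_time, baseline_units, run_time, run_units):
    """Speedup and parallel efficiency of a run relative to a baseline run
    of the same (fixed) problem size.

    speedup    = baseline_time / run_time
    efficiency = speedup / (run_units / baseline_units)
    """
    speedup = baseline_time / run_time
    efficiency = speedup / (run_units / baseline_units)
    return speedup, efficiency


if __name__ == "__main__":
    # Hypothetical wall-clock times in seconds, keyed by total thread count.
    # These are made-up numbers, NOT measurements from the paper.
    hypothetical_times = {1: 100.0, 2: 52.0, 4: 27.0, 8: 15.0}

    for units, seconds in hypothetical_times.items():
        speedup, efficiency = strong_scaling(
            hypothetical_times[1], 1, seconds, units
        )
        print(f"{units} threads: speedup={speedup:.2f}, "
              f"efficiency={efficiency:.2f}")

    # All ways to split 8 hardware threads between processes and threads.
    print(process_thread_combinations(8))
```

In a hybrid MPI+OpenMP setting, each pair returned by `process_thread_combinations` would correspond to one candidate run configuration, and comparing their efficiencies is one way to identify an appropriate combination.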


Keywords: Intel MIC, NPB-MZ, performance evaluation, scalability, single thread performance





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yuqian Li (1)
  • Yonggang Che (1)
  • Zhenghua Wang (1)
  1. National Laboratory of Parallel and Distributed Processing, National University of Defense Technology, Changsha, China
