Analysis of Cray XC30 Performance Using Trinity-NERSC-8 Benchmarks and Comparison with Cray XE6 and IBM BG/Q

  • M. J. Cordery
  • Brian Austin
  • H. J. Wassermann
  • C. S. Daley
  • N. J. Wright
  • S. D. Hammond
  • D. Doerfler
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8551)


In this paper, we examine the performance of a suite of applications on three different architectures: Edison, a Cray XC30 with Intel Ivy Bridge processors; Hopper and Cielo, both Cray XE6 systems with AMD Magny–Cours processors; and Mira, an IBM BlueGene/Q with PowerPC A2 processors. The applications are a subset of those used in a joint procurement effort between Lawrence Berkeley National Laboratory, Los Alamos National Laboratory, and Sandia National Laboratories. Strong scaling results are presented for both MPI-only and MPI+OpenMP execution models.


Benchmarking HPC Performance 





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • M. J. Cordery (1), corresponding author
  • Brian Austin (1)
  • H. J. Wassermann (1)
  • C. S. Daley (1)
  • N. J. Wright (1)
  • S. D. Hammond (2)
  • D. Doerfler (2)

  1. NERSC, Lawrence Berkeley National Laboratory, Berkeley, USA
  2. Center for Computing Research, Sandia National Laboratories, Albuquerque, USA
