The Performance Wall of Large Parallel Computing Systems

  • Conference paper

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 68)

Abstract

Since the advent of parallelized computing units, it has been known that the performance actually achieved is less than the nominal performance available: the unproductive part of the computing capacity remains “dark”. It is also known that the amount of dark performance depends strongly on the number of processing units working in parallel, so its role must be crucial for supercomputers, where the coming exa-scale machines deploy millions of processors, and for the exa-scale applications they run, such as brain simulation and Earth simulation. Although the effects limiting parallel performance have been known from the beginning, their relative weights have changed considerably as the field has developed, and they depend strongly on the type of application. For large computing systems, “dark performance” represents a new major obstacle, in addition to the earlier ones such as the “heat wall”, the “memory wall”, and “dark silicon”. Careful reconsideration reveals that, in contrast with the general belief, supercomputer performance has an upper limit, and reaching that limit explains some strange and mysterious events, such as projects being cancelled immediately before their target date, or a special-purpose brain simulator being unable to outperform the many-thread simulator running on a general-purpose supercomputer.
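
The upper limit asserted above follows the logic of Amdahl's classic argument [13, 16]: if a fraction s of a workload is inherently sequential, the speedup on N processors is S(N) = 1/(s + (1 − s)/N), which saturates at 1/s however large N grows, and everything above that ceiling remains dark. The short Python sketch below is not part of the paper; the function name, the chosen sequential fraction, and the processor counts are illustrative assumptions that make the saturation concrete.

    # Illustrative sketch (not from the paper): Amdahl-type speedup saturation.
    # With sequential fraction s, the speedup on N processors is
    #   S(N) = 1 / (s + (1 - s) / N),
    # which approaches 1/s as N grows; nominal performance beyond that stays "dark".

    def amdahl_speedup(n_processors, sequential_fraction):
        """Speedup of a fixed workload on n_processors under Amdahl's law."""
        s = sequential_fraction
        return 1.0 / (s + (1.0 - s) / n_processors)

    if __name__ == "__main__":
        s = 1e-7  # even a tiny sequential fraction caps the speedup at 1/s = 1e7
        for n in (10**3, 10**6, 10**8, 10**9):
            print(f"N = {n:>13,}: speedup = {amdahl_speedup(n, s):.3e} "
                  f"(ceiling = {1.0 / s:.0e})")

Even at N = 10^9 the computed speedup stays just under the 1/s = 10^7 ceiling, which is the sense in which adding processors past a certain system size buys essentially no further payload performance.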


References

  1. Dongarra, J.: The Global Race for Exascale High Performance Computing (2017). http://ec.europa.eu/newsroom/document.cfm?doc_id=45647

  2. Bourzac, K.: Stretching supercomputers to the limit. Nature 551, 554–556 (2017)

  3. Esmaeilzadeh, H.: Approximate acceleration: a path through the era of dark silicon and big data. In: Proceedings of the 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems, CASES 2015, pp. 31–32 (2015)

  4. Fuller, S.H., Millett, L.I.: Computing performance: game over or next level? Computer 44, 31–38 (2011)

  5. Fu, H., Liao, J., Yang, J., Wang, L., Song, Z., Huang, X., Yang, C., Xue, W., Liu, F., Qiao, F., Zhao, W., Yin, X., Hou, C., Zhang, C., Ge, W., Zhang, J., Wang, Y., Zhou, C., Yang, G.: The Sunway TaihuLight supercomputer: system and applications. Sci. China Inf. Sci. 59(7), 1–16 (2016)

  6. Végh, J.: Limitations of performance of exascale applications and supercomputers they are running on. ArXiv e-prints; Submitted to Special Issue of IEEE J. Parallel Distrib. Comput. (2018)

  7. Top500.org: Retooled Aurora Supercomputer Will Be America's First Exascale System (2017). https://www.top500.org/news/retooled-aurora-supercomputer-will-be-americas-first-exascale-system/

  8. Inside HPC: Is Aurora Morphing into an Exascale AI Supercomputer? (2017). https://insidehpc.com/2017/06/told-aurora-morphing-novel-architecture-ai-supercomputer/

  9. European Commission: Implementation of the Action Plan for the European High-Performance Computing strategy (2016). http://ec.europa.eu/newsroom/dae/document.cfm?docid=15269

  10. US DOE: The Opportunities and Challenges of Exascale Computing (2010). https://science.energy.gov/~/media/ascr/ascac/pdf/reports/Exascale_subcommittee_report.pdf

  11. van Albada, S.J., Rowley, A.G., Senk, J., Hopkins, M., Schmidt, M., Stokes, A.B., et al.: Performance comparison of the digital neuromorphic hardware SpiNNaker and the neural network simulation software NEST for a full-scale cortical microcircuit model. Front. Neurosci. 12, 291 (2018). https://doi.org/10.3389/fnins.2018.00291

  12. Paul, J.M., Meyer, B.H.: Amdahl’s law revisited for single chip systems. Int. J. Parallel Prog. 35(2), 101–123 (2007)

  13. Amdahl, G.M.: Validity of the single processor approach to achieving large scale computing capabilities. In: Proceedings of the April 18–20, 1967, Spring Joint Computer Conference, AFIPS '67 (Spring) (1967)

  14. www.supercomputersnotes.info

  15. Végh, J., Molnár, P.: How to measure perfectness of parallelization in hardware/software systems. In: 2017 18th International Carpathian Control Conference (ICCC), pp. 394–399 (2017)

  16. Amdahl, G.M.: Validity of the single processor approach to achieving large-scale computing capabilities. In: AFIPS Conference Proceedings, vol. 30, pp. 483–485 (1967)

  17. Karp, A.H., Flatt, H.P.: Measuring parallel processor performance. Commun. ACM 33(5), 539–543 (1990)

  18. TOP500: November 2017 List of Supercomputers (2017). https://www.top500.org/lists/2017/11/

  19. Ellen, F., Hendler, D., Shavit, N.: On the inherent sequentiality of concurrent objects. SIAM J. Comput. 43(3), 519–536 (2012). https://doi.org/10.1137/08072646X

  20. Yavits, L., Morad, A., Ginosar, R.: The effect of communication and synchronization on Amdahl’s law in multicore systems. Parallel Comput. 40(1), 1–16 (2014)

  21. Denning, P.J., Lewis, T.: Exponential laws of computing growth. Commun. ACM 60(1), 54–65 (2017)

  22. Dongarra, J.: Report on the Sunway TaihuLight system. Technical report UT-EECS-16-742 (2016)

  23. Tsafrir, D.: The context-switch overhead inflicted by hardware interrupts (and the enigma of do-nothing loops). In: Proceedings of the 2007 Workshop on Experimental Computer Science, ExpCS 2007, p. 3. ACM, New York (2007). http://doi.acm.org/10.1145/1281700.1281704

  24. Zheng, F., et al.: Cooperative computing techniques for a deeply fused and heterogeneous many-core processor architecture. J. Comput. Sci. Technol. 30(1), 145–162 (2015)

  25. IEEE Spectrum: Two Different Top500 Supercomputing Benchmarks Show Two Different Top Supercomputers (2017). https://spectrum.ieee.org/tech-talk/computing/hardware/two-different-top500-supercomputing-benchmarks-show-two-different-top-supercomputers

  26. www.studentclustercompetition.us

  27. Ippen, T., Eppler, J.M., Plesser, H.E., Diesmann, M.: Constructing neuronal network models in massively parallel environments. Front. Neuroinform. 11, 30 (2017). https://doi.org/10.3389/fninf.2017.00030

  28. doaj.org

  29. Végh, J.: Introducing the explicitly many-processor approach. Parallel Comput. (2018)

  30. Dettmers, T.: The Brain vs Deep Learning Part I: Computational Complexity or Why the Singularity Is Nowhere Near (2015). http://timdettmers.com/2015/07/27/brain-vs-deep-learning-singularity/

  31. HPCG Benchmark (2016). http://www.hpcg-benchmark.org/

  32. Markov, I.: Limits on fundamental limits to computation. Nature 512(7513), 147–154 (2014)

  33. Top500.org: DOE Withholds Details of First Exascale Supercomputer (2018). https://www.top500.org/news/doe-witholds-details-of-first-exascale-supercomputer-even-as-it-solicits-researchers-to-apply-for-early-access/

  34. Végh, J.: Renewing computing paradigms for more efficient parallelization of single-threads (chap. 13). In: Advances in Parallel Computing, pp. 305–330. IOS Press (2018)


Acknowledgements

Project no. 125547 has been implemented with the support provided by the National Research, Development and Innovation Fund of Hungary, financed under the K funding scheme [29].

Author information

Corresponding author

Correspondence to János Végh.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Végh, J., Vásárhelyi, J., Drótos, D. (2019). The Performance Wall of Large Parallel Computing Systems. In: Kabashkin, I., Yatskiv (Jackiva), I., Prentkovskis, O. (eds) Reliability and Statistics in Transportation and Communication. RelStat 2018. Lecture Notes in Networks and Systems, vol 68. Springer, Cham. https://doi.org/10.1007/978-3-030-12450-2_21

  • DOI: https://doi.org/10.1007/978-3-030-12450-2_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-12449-6

  • Online ISBN: 978-3-030-12450-2

  • eBook Packages: Engineering, Engineering (R0)
