Abstract
Since the beginning of parallelized computing it has been known that actual performance falls short of the nominal performance: the unproductive part of the computing capacity remains “dark”. It is also known that the amount of dark performance depends strongly on the number of processing units working in parallel, so its role is crucial for supercomputers, where the coming exa-scale models utilize millions of processors, as well as for the exa-scale applications they run, such as brain simulation and Earth simulation. Although the effects degrading parallel performance have been known from the beginning, their relative weights have changed considerably with the development of the field and depend strongly on the type of application. For large computing systems, “dark performance” represents a new major obstacle, in addition to the former ones such as the “heat wall”, the “memory wall”, and “dark silicon”. Careful reconsideration reveals that, in contrast to the general belief, supercomputer performance has an upper limit, and reaching that limit explains some strange and mysterious events, such as projects being canceled immediately before their target date, or the special-purpose brain simulator failing to outperform the many-thread simulator running on a general-purpose supercomputer.
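The upper limit claimed in the abstract can be made concrete with Amdahl's classic speedup formula: if a fraction (1 − α) of the work is inherently sequential, the speedup saturates at 1/(1 − α) no matter how many processors are added. Below is a minimal illustrative Python sketch; the function name and the 10⁻⁷ sequential fraction are hypothetical choices for demonstration, not figures taken from the paper.

```python
def amdahl_speedup(n_processors: int, parallel_fraction: float) -> float:
    """Amdahl's Law: speedup achievable with n_processors when only
    parallel_fraction of the total work can be parallelized."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)

# Even with a tiny sequential fraction of 1e-7, the speedup
# saturates near the asymptote 1 / (1 - alpha) = 1e7:
for n in (1_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13} processors -> speedup {amdahl_speedup(n, 1 - 1e-7):,.0f}")
```

Running the loop shows the saturation effect: going from a million to a billion processors barely improves the speedup, because the sequential fraction dominates — the “dark” part of the nominal performance grows with the processor count.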
References
Dongarra, J.: The Global Race for Exascale High Performance Computing (2017). http://ec.europa.eu/newsroom/document.cfm?doc_id=45647
Bourzac, K.: Stretching supercomputers to the limit. Nature 551, 554–556 (2017)
Esmaeilzadeh, H.: Approximate acceleration: a path through the era of dark silicon and big data. In: Proceedings of the 2015 International Conference on Compilers, Architecture and Synthesis for Embedded Systems, CASES 2015, pp. 31–32
Fuller, S.H., Millett, L.I.: Computing performance: game over or next level? Computer 44, 31–38 (2011)
Fu, H., Liao, J., Yang, J., Wang, L., Song, Z., Huang, X., Yang, C., Xue, W., Liu, F., Qiao, F., Zhao, W., Yin, X., Hou, C., Zhang, C., Ge, W., Zhang, J., Wang, Y., Zhou, C., Yang, G.: The Sunway TaihuLight supercomputer: system and applications. Sci. China Inf. Sci. 59(7), 1–16 (2016)
Végh, J.: Limitations of performance of exascale applications and supercomputers they are running on. ArXiv e-prints; Submitted to Special Issue of IEEE J. Parallel Distrib. Comput. (2018)
Top500.org: Retooled Aurora Supercomputer will be America's First Exascale System (2017). https://www.top500.org/news/retooled-aurora-supercomputer-will-be-americas-first-exascale-system/
Inside HPC: Is Aurora Morphing into an Exascale AI Supercomputer? (2017). https://insidehpc.com/2017/06/told-aurora-morphing-novel-architecture-ai-supercomputer/
European Commission: Implementation of the Action Plan for the European High-Performance Computing strategy (2016). http://ec.europa.eu/newsroom/dae/document.cfm?docid=15269
US DOE: The Opportunities and Challenges of Exascale Computing (2010). https://science.energy.gov/~/media/ascr/ascac/pdf/reports/Exascale_subcommittee_report.pdf
van Albada, S.J., Rowley, A.G., Senk, J., Hopkins, M., Schmidt, M., Stokes, A.B., et al.: Performance comparison of the digital neuromorphic hardware SpiNNaker and the neural network simulation software NEST for a full-scale cortical microcircuit model. Front. Neurosci. 12, 291 (2018). https://doi.org/10.3389/fnins.2018.00291
Paul, J.M., Meyer, B.H.: Amdahl’s law revisited for single chip systems. Int. J. Parallel Prog. 35(2), 101–123 (2007)
Amdahl, G.M.: Validity of the single processor approach to achieving large scale computing capabilities. In: AFIPS Conference Proceedings of the April 18–20, 1967, Spring Joint Computer Conference, vol. 30, pp. 483–485 (1967)
Végh, J., Molnár, P.: How to measure perfectness of parallelization in hardware/software systems. In: 18th International Carpathian Control Conference on ICCC, pp. 394–399 (2017)
Karp, A.H., Flatt, H.P.: Measuring parallel processor performance. Commun. ACM 33(5), 539–543 (1990)
TOP500: November 2017 List of Supercomputers (2017). https://www.top500.org/lists/2017/11/
Ellen, F., Hendler, D., Shavit, N.: On the inherent sequentiality of concurrent objects. SIAM J. Comput. 43(3), 519–536 (2012). https://doi.org/10.1137/08072646X
Yavits, L., Morad, A., Ginosar, R.: The effect of communication and synchronization on Amdahl’s law in multicore systems. Parallel Comput. 40(1), 1–16 (2014)
Denning, P.J., Lewis, T.G.: Exponential laws of computing growth. Commun. ACM 60(1), 54–65 (2017)
Dongarra, J.: Report on the Sunway Taihulight system. Technical report UT-EECS-16-742 (2016)
Tsafrir, D.: The context-switch overhead inflicted by hardware interrupts (and the enigma of do-nothing loops). In: Proceedings of the 2007 Workshop on Experimental Computer Science, ExpCS 2007, p. 3. ACM, New York (2007). http://doi.acm.org/10.1145/1281700.1281704
Zheng, F., et al.: Cooperative computing techniques for a deeply fused and heterogeneous many-core processor architecture. J. Comput. Sci. Technol. 30(1), 145–162 (2015)
IEEE Spectrum: Two Different Top500 Supercomputing Benchmarks Show Two Different Top Supercomputers (2017). https://spectrum.ieee.org/tech-talk/computing/hardware/two-different-top500-supercomputing-benchmarks-show-two-different-top-supercomputers
Ippen, T., Eppler, J.M., Plesser, H.E., Diesmann, M.: Constructing neuronal network models in massively parallel environments. Front. Neuroinform. 11, 30 (2017). https://doi.org/10.3389/fninf.2017.00030
Végh, J.: Introducing the explicitly many-processor approach. Parallel Comput. (2018)
Dettmers, T.: The Brain vs Deep Learning Part I: Computational Complexity or Why the Singularity Is Nowhere Near (2015). http://timdettmers.com/2015/07/27/brain-vs-deep-learning-singularity/
HPCG Benchmark (2016). http://www.hpcg-benchmark.org/
Markov, I.: Limits on fundamental limits to computation. Nature 512(7513), 147–154 (2014)
Top500.org: DOE Withholds Details of First Exascale Supercomputer (2018). https://www.top500.org/news/doe-witholds-details-of-first-exascale-supercomputer-even-as-it-solicits-researchers-to-apply-for-early-access/
Végh, J.: Renewing computing paradigms for more efficient parallelization of single-threads (chap. 13). In: Advances in Parallel Computing, pp. 305–330. IOS Press (2018)
Acknowledgements
Project no. 125547 has been implemented with support provided by the National Research, Development and Innovation Fund of Hungary, financed under the K funding scheme [29].
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Végh, J., Vásárhelyi, J., Drótos, D. (2019). The Performance Wall of Large Parallel Computing Systems. In: Kabashkin, I., Yatskiv (Jackiva), I., Prentkovskis, O. (eds) Reliability and Statistics in Transportation and Communication. RelStat 2018. Lecture Notes in Networks and Systems, vol 68. Springer, Cham. https://doi.org/10.1007/978-3-030-12450-2_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12449-6
Online ISBN: 978-3-030-12450-2