Sunway supercomputer architecture towards exascale computing: analysis and practice

Abstract

In recent years, improving the system performance and energy efficiency of supercomputers has become increasingly difficult, which places heavier demands on architecture design for realizing exascale computing. This paper first analyzes the main requirements of exascale computing from the perspectives of parallel computing applications and supercomputing center operation. It then proposes a “demands-challenges-architecture” mapping scheme. The major challenges of exascale supercomputers, such as scalability, power consumption, data movement, programming and availability, are analyzed thoroughly, and corresponding solutions are proposed. Moreover, this paper proposes the Sunway computer architecture towards exascale computing, in which the many-core processor, network chipset and software system are all domestically designed. The technology roadmap of the Sunway supercomputer covers comprehensive design methods for the architecture, including the processor, interconnect network, assembly structure, power supply, cooling system, system software, parallel algorithms and application support, promising great advances for exascale supercomputing.

Acknowledgements

This work was supported by the National Key Research and Development Project of China (Grant No. 2016YFB-0200500).

Author information

Corresponding author

Correspondence to Jiangang Gao.

About this article

Cite this article

Gao, J., Zheng, F., Qi, F. et al. Sunway supercomputer architecture towards exascale computing: analysis and practice. Sci. China Inf. Sci. 64, 141101 (2021). https://doi.org/10.1007/s11432-020-3104-7

Keywords

  • supercomputer
  • exascale
  • Sunway
  • scalability
  • power consumption
  • data movement
  • programming
  • availability