Supercomputer Efficiency: Complex Approach Inspired by Lomonosov-2 History Evaluation
The number of supercomputer users and the jobs they execute is growing rapidly, especially on supercomputers that provide computing time to external users. Supercomputers and their computing time are highly expensive, so their efficiency is crucial for both users and owners. There are several ways to increase operational efficiency; in most cases, however, they involve a trade-off between efficiency metrics. This creates a need to define "efficiency" in each specific case. We use historical data from the two largest Russian supercomputers, Lomonosov and Lomonosov-2, to create a number of metrics that together define resource management "efficiency". The data from both systems covers more than one year of job execution history. The efficiency of Lomonosov and Lomonosov-2 in terms of CPU hours utilization is already high; nevertheless, our overall goal is to offer a way to maintain or improve this metric while maximizing the other metrics examined in the paper.
Keywords: High-performance computing; Resource management; Supercomputer job scheduling efficiency
This material is based upon work supported by the Russian Foundation for Basic Research (Agreement N 17-07-00664 A).