Abstract
Next-generation supercomputers are expected to consume tens of MW of electric power. The power is expected to instantaneously fluctuate between several MW to tens of MW during their execution. This fluctuation can cause voltage drops in regional power grids and affect the operation of chillers and generators in the computer’s facility. Predicting such fluctuations in advance can aid the safe operation of power grids and facility. Because abrupt fluctuations and a high average of consumed power are application-specific features, it is important to identify an application before job execution. This paper provides a methodology for classifying executed jobs into applications. By this method, various statistics for each application such as the number of executions, runtime, resource usage, and power consumption can be examined. To estimate the power consumed because of job execution, we propose a method to predict application characteristics using submitted job scripts. We demonstrate that 328 kinds of applications are executed in 273,121 jobs and that the application can be predicted with an accuracy of approximately 92%.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Notes
- 1.
Floating-point Operations Per Second.
- 2.
Basic Linear Algebra Subprograms.
- 3.
Fastest Fourier Transform library.
References
Agrawal, K., Fahey, M.R., McLay, R., James, D.: User environment tracking and problem detection with XALT. In: Proceedings of the First International Workshop on HPC User Support Tools, HUST 2014, pp. 32–40 (2014)
Andoh, Y., Yoshii, N., Fujimoto, K., Mizutani, K., Kojima, H., Yamada, A., Okazaki, S., Kawaguchi, K., Nagao, H., Iwahashi, K., Mizutani, F., Minami, K., Ichikawa, S., Komatsu, H., Ishizuki, S., Takeda, Y., Fukushima, M.: MODYLAS: a highly parallelized general-purpose molecular dynamics simulation program for large-scale systems with long-range forces calculated by fast multipole method (FMM) and highly scalable fine-grained new parallel processing algorithms. J. Chem. Theory Comput. 9(7), 3201–3209 (2013)
Borghesi, A., Bartolini, A., Lombardi, M., Milano, M., Benini, L.: Predictive modeling for job power consumption in HPC systems. In: Kunkel, J.M., Balaji, P., Dongarra, J. (eds.) ISC High Performance 2016. LNCS, vol. 9697, pp. 181–199. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-41321-1_10
Budiardja, R.D., Agrawal, K., Fahey, M., McLay, R., James, D.: Library function tracking with XALT. In: Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale, XSEDE 2016, pp. 30:1–30:7 (2016)
Chen, X., Lu, C.D., Pattabiraman, K.: Predicting job completion times using system logs in supercomputing clusters. In: 2013 43rd Annual IEEE/IFIP Conference on Dependable Systems and Networks Workshop (DSN-W), pp. 1–8, June 2013
Dongarra, J.: TOP500 Supercomputer Sites: Lists, November 2017. https://www.top500.org/lists/2017/11/
Dongarra, J., Heroux, M.A., Luszczek, P.: A new metric for ranking high-performance computing systems. Natl. Sci. Rev. 3(1), 30–35 (2016)
Gaussier, E., Glesser, D., Reis, V., Trystram, D.: Improving backfilling by using machine learning to predict running times. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015, pp. 64:1–64:10 (2015)
Hadri, B., Fahey, M.: Mining software usage with the automatic library tracking database (ALTD). Procedia Comput. Sci. 18(Supplement C), 1834–1843 (2013). International Conference on Computational Science
Hutter, J., Iannuzzi, M., Schiffmann, F., VandeVondele, J.: CP2K: atomistic simulations of condensed matter systems. Wiley Interdisc. Rev. Comput. Mol. Sci. 4(1), 15–25 (2014)
Kresse, G., Hafner, J.: Ab initio molecular dynamics for liquid metals. Phys. Rev. B 47, 558–561 (1993)
Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In: Proceedings of the 31st International Conference on International Conference on Machine Learning, ICML 2014, vol. 32, pp. II-1188–II-1196 (2014)
Lu, C.D.: Automatically mining program build information via signature matching. In: Proceedings of the 11th ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering, PASTE 2013, pp. 25–32 (2013)
McKenna, R., Herbein, S., Moody, A., Gamblin, T., Taufer, M.: Machine learning predictions of runtime and IO traffic on high-end clusters. In: 2016 IEEE International Conference on Cluster Computing, pp. 255–258 (2016)
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Řehůřek, R., Sojka, P.: Software framework for topic modelling with large corpora. In: Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks, pp. 45–50, May 2010
Shoukourian, H., Wilde, T., Auweter, A., Bode, A.: Predicting the energy and power consumption of strong and weak scaling HPC applications. Supercomput. Front. Innov. 1(2), 20–41 (2014)
Staples, G.: Torque resource manager. In: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, SC 2006 (2006)
Storlie, C., Sexton, J., Pakin, S., Lang, M., Reich, B., Rust, W.: Modeling and Predicting Power Consumption of High Performance Computing Jobs. ArXiv e-prints, December 2014
Tam, A.M.: Enabling process accounting on Linux HOWTO. http://www.tldp.org/HOWTO/text/Process-Accounting
Yamamoto, K., Uno, A., Murai, H., Tsukamoto, T., Shoji, F., Matsui, S., Sekizawa, R., Sueyasu, F., Uchiyama, H., Okamoto, M., Ohgushi, N., Takashina, K., Wakabayashi, D., Taguchi, Y., Yokokawa, M.: The k computer operations: experiences and statistics. Procedia Comput. Sci. 29, 576–585. International Conference on Computational Science (2014)
Yoo, A.B., Jette, M.A., Grondona, M.: SLURM: simple linux utility for resource management. In: Feitelson, D., Rudolph, L., Schwiegelshohn, U. (eds.) JSSPP 2003. LNCS, vol. 2862, pp. 44–60. Springer, Heidelberg (2003). https://doi.org/10.1007/10968987_3
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Yamamoto, K., Tsujita, Y., Uno, A. (2018). Classifying Jobs and Predicting Applications in HPC Systems. In: Yokota, R., Weiland, M., Keyes, D., Trinitis, C. (eds) High Performance Computing. ISC High Performance 2018. Lecture Notes in Computer Science(), vol 10876. Springer, Cham. https://doi.org/10.1007/978-3-319-92040-5_5
Download citation
DOI: https://doi.org/10.1007/978-3-319-92040-5_5
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-92039-9
Online ISBN: 978-3-319-92040-5
eBook Packages: Computer ScienceComputer Science (R0)