AccaSim: An HPC Simulator for Workload Management

Galleguillos, Cristian; Kiziltan, Zeynep; Netti, Alessio

doi:10.1007/978-3-319-73353-1_12

Part of the book series: Communications in Computer and Information Science (CCIS, volume 796)

Included in the following conference series:

Latin American High Performance Computing Conference

Abstract

We present AccaSim, an HPC simulator for workload management. Thanks to the scalability and high customizability features of AccaSim, users can easily represent various real HPC system resources, develop dispatching methods and carry out large experiments across different workload sources. AccaSim is thus an attractive tool for conducting controlled experiments in HPC dispatching research.

Acknowledgments

C. Galleguillos is supported by Postgraduate Grant PUCV 2017. We thank Alina Sîrbu for fruitful discussions on the work presented here.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Bologna, Bologna, Italy
Cristian Galleguillos, Zeynep Kiziltan & Alessio Netti
Escuela de Ing. Informática, Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile
Cristian Galleguillos

Corresponding author

Correspondence to Cristian Galleguillos.

Editor information

Editors and Affiliations

CSC-CONICET and Universidad de Buenos Aires, Buenos Aires, Argentina
Esteban Mocskos
Universidad de la República, Montevideo, Uruguay
Sergio Nesmachnow

Cite this paper

Galleguillos, C., Kiziltan, Z., Netti, A. (2018). AccaSim: An HPC Simulator for Workload Management. In: Mocskos, E., Nesmachnow, S. (eds) High Performance Computing. CARLA 2017. Communications in Computer and Information Science, vol 796. Springer, Cham. https://doi.org/10.1007/978-3-319-73353-1_12

DOI: https://doi.org/10.1007/978-3-319-73353-1_12
Published: 28 December 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73352-4
Online ISBN: 978-3-319-73353-1
eBook Packages: Computer Science (R0)
