Cost and Performance Modeling for Earth System Data Management and Beyond

  • Jakob Lüttgau
  • Julian Kunkel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11203)


Current and anticipated storage environments confront domain scientists and data center operators with usability, performance and cost challenges. The amount of data that upcoming systems will be required to handle is expected to grow exponentially, mainly due to increasing resolution and affordable compute power. Unfortunately, the relationship between cost and performance is not always well understood, so educated procurement requires considerable effort. Within the Centre of Excellence in Simulation of Weather and Climate in Europe (ESiWACE), models to better understand the cost and performance of current and future systems are being explored. This paper presents models and methodology focusing on, but not limited to, data centers used in the context of climate and numerical weather prediction. The paper concludes with a case study of alternative deployment strategies and outlines the challenges of anticipating their impact on cost and performance. By publishing these early results, we would like to make the case for working collaboratively as a community towards standard models and methodologies, creating sufficient incentives for vendors to provide specifications in formats that are compatible with these modeling tools. In addition, we see applications for such formalized models and information in I/O-related middleware, which is expected to make automated but reasonable decisions in increasingly heterogeneous data centers.


Storage, Data management, Earth systems, TCO, Cost



The ESiWACE project received funding from the EU Horizon 2020 research and innovation programme under grant agreement No 675191.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Deutsches Klimarechenzentrum, Hamburg, Germany
  2. University of Reading, Reading, UK
