
Early Evaluation of the “Infinite Memory Engine” Burst Buffer Solution

  • Conference paper
High Performance Computing (ISC High Performance 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9945)


Abstract

Hierarchical storage architectures are required to meet both the capacity and the bandwidth requirements of future high-end storage systems. In this paper we present the results of an evaluation of an emerging technology, DataDirect Networks’ (DDN) Infinite Memory Engine (IME). IME makes it possible to place a fast buffer in front of a large-capacity storage system. We collected benchmarking data with IOR and with the HPC application NEST. The IOR bandwidth results show how well the network bandwidth towards such a fast buffer can be exploited compared to the external storage system. The NEST benchmarks clearly demonstrate that IME can reduce I/O-induced load imbalance between MPI ranks to a minimum while speeding up I/O as a whole by a considerable factor.
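One way to make "I/O-induced load imbalance between MPI ranks" concrete is to compare the slowest rank's I/O time with the average over all ranks. The following is a minimal sketch under that assumption. The function name `io_imbalance`, the max-over-mean ratio, and the timing values are all hypothetical illustrations. They are not the metric or the measurements used in the paper.

```python
from statistics import mean


def io_imbalance(io_seconds_per_rank):
    """Return (slowest, average, ratio) for a list of per-rank I/O times.

    The ratio slowest/average is an assumed imbalance measure:
    1.0 means every rank spent the same time in I/O, larger values
    mean that some ranks wait for a straggler.
    """
    if not io_seconds_per_rank:
        raise ValueError("need at least one rank")
    slowest = max(io_seconds_per_rank)
    average = mean(io_seconds_per_rank)
    # Guard against division by zero when no rank did any I/O.
    ratio = slowest / average if average > 0 else 1.0
    return slowest, average, ratio


# Hypothetical per-rank I/O times in seconds (not data from the paper).
direct_to_storage = [2.0, 2.0, 2.0, 10.0]   # one straggling rank
through_buffer = [1.0, 1.0, 1.0, 1.0]       # evenly balanced ranks

for label, times in [("direct", direct_to_storage), ("buffered", through_buffer)]:
    slowest, average, ratio = io_imbalance(times)
    print(f"{label}: slowest={slowest:.1f}s average={average:.1f}s ratio={ratio:.2f}")
```

In a bulk-synchronous application all ranks effectively wait for the slowest one, so a ratio close to 1.0 together with a smaller slowest time corresponds to the two effects the abstract describes: reduced imbalance and faster I/O overall.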


Notes

  1. For the setup that was put in place in December 2015, MPI I/O was not sufficiently stable to obtain coherent results for IOR using MPIIO or HDF5 as the I/O API.

  2. Because this I/O approach does not scale well on supercomputers, a new I/O subsystem for NEST based on parallel I/O libraries is currently being developed [14].

  3. http://www.nersc.gov/users/computational-systems/cori/burst-buffer/example-batch-scripts/

References

  1. Gray, J., Shenoy, P.: Rules of thumb in data engineering, pp. 3–10 (2000)


  2. Bent, J., Grider, G., Kettering, B., Manzanares, A., McClelland, M., Torres, A., Torrez, A.: Storage challenges at Los Alamos National Lab. In: 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), pp. 1–5, April 2012


  3. Moody, A., Bronevetsky, G., Mohror, K., de Supinski, B.: Design, modeling, and evaluation of a scalable multi-level checkpointing system. In: 2010 International Conference for High Performance Computing, Networking, Storage and Analysis (SC), pp. 1–11, November 2010


  4. Bent, J., Gibson, G., Grider, G., McClelland, B., Nowoczynski, P., Nunez, J., Polte, M., Wingate, M.: PLFS: a checkpoint filesystem for parallel applications. In: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pp. 1–12, November 2009


  5. El Sayed, S., Graf, S., Hennecke, M., Pleiter, D., Schwarz, G., Schick, H., Stephan, M.: Using GPFS to manage NVRAM-based storage cache. In: Kunkel, J.M., Ludwig, T., Meuer, H.W. (eds.) ISC 2013. LNCS, vol. 7905, pp. 435–446. Springer, Heidelberg (2013)


  6. Liu, N., Cope, J., Carns, P., Carothers, C., Ross, R., Grider, G., Crume, A., Maltzahn, C.: On the role of burst buffers in leadership-class storage systems. In: 2012 IEEE 28th Symposium on Mass Storage Systems and Technologies (MSST), pp. 1–11, April 2012


  7. Carns, P., Harms, K., Allcock, W., Bacon, C., Lang, S., Latham, R., Ross, R.: Understanding and improving computational science storage access through continuous characterization. In: 2011 IEEE 27th Symposium on Mass Storage Systems and Technologies (MSST), pp. 1–14, May 2011


  8. Kannan, S., Gavrilovska, A., Schwan, K., Milojicic, D., Talwar, V.: Using active NVRAM for I/O staging. In: PDAC 2011, pp. 15–22. ACM, New York (2011)


  9. He, J., Jagatheesan, A., Gupta, S., Bennett, J., Snavely, A.: DASH: a recipe for a flash-based data intensive supercomputer. In: 2010 International Conference for High Performance Computing, Networking, Storage and Analysis (SC), pp. 1–11, November 2010


  10. Abbasi, H., Wolf, M., Eisenhauer, G., Klasky, S., Schwan, K., Zheng, F.: DataStager: scalable data staging services for petascale applications. In: HPDC 2009, pp. 39–48. ACM, New York (2009)


  11. Docan, C., Parashar, M., Klasky, S.: Enabling high-speed asynchronous data extraction and transfer using DART. Concur. Comput. Pract. Exper. 22(9), 1181–1204 (2010)


  12. Gewaltig, M.O., Diesmann, M.: NEST (Neural Simulation Tool). Scholarpedia 2(4), 1430 (2007)


  13. Kunkel, S., Schmidt, M., Eppler, J.M., Plesser, H.E., Masumoto, G., Igarashi, J., Ishii, S., Fukai, T., Morrison, A., Diesmann, M., Helias, M.: Spiking network simulation code for petascale computers. Front. Neuroinform. 8, Article number 78 (2014)


  14. Schumann, T., Frings, W., Peyser, A., Schenck, W., Thust, K., Eppler, J.M.: Modeling the I/O behavior of the NEST simulator using a proxy. In: Elgeti, S., Simon, J.W. (eds.) Conference Proceedings of the YIC GACM 2015, Aachen (Germany), pp. 213–216. Publication Server of RWTH Aachen University (2015)


  15. Schenck, W., Adinetz, A.V., Zaytsev, Y.V., Pleiter, D., Morrison, A.: Performance model for large-scale neural simulations with NEST. In: SC14 Conference for Supercomputing (Extended Poster Abstracts), New Orleans (LA), November 2014


  16. Morrison, A., Aertsen, A., Diesmann, M.: Spike-timing dependent plasticity in balanced random networks. Neural Comput. 19(6), 1437–1467 (2007)


  17. Lujan, J., et al.: APEX workflows. Technical report, LANL, NERSC, SNL (2015)


  18. Rozier, E.W.D., Belluomini, W., Deenadhayalan, V., Hafner, J., Rao, K., Zhou, P.: Evaluating the impact of undetected disk errors in RAID systems. In: IEEE/IFIP International Conference on Dependable Systems Networks, DSN 2009, pp. 83–92, June 2009



Acknowledgements

We would like to thank DDN for making an IME test system available at Jülich Supercomputing Centre. In particular, we gratefully acknowledge the continuous support by Tommaso Cecchi and Toine Beckers.

Author information


Corresponding author

Correspondence to Wolfram Schenck.


Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Schenck, W., El Sayed, S., Foszczynski, M., Homberg, W., Pleiter, D. (2016). Early Evaluation of the “Infinite Memory Engine” Burst Buffer Solution. In: Taufer, M., Mohr, B., Kunkel, J. (eds) High Performance Computing. ISC High Performance 2016. Lecture Notes in Computer Science, vol 9945. Springer, Cham. https://doi.org/10.1007/978-3-319-46079-6_41


  • DOI: https://doi.org/10.1007/978-3-319-46079-6_41


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-46078-9

  • Online ISBN: 978-3-319-46079-6

  • eBook Packages: Computer Science, Computer Science (R0)
