Abstract
Composing computational science applications, both into ad hoc pipelines for analyzing collected or generated data and into well-defined, repeatable workflows, is becoming increasingly popular. Meanwhile, dedicated high performance computing storage environments are rapidly becoming more diverse, with both significant amounts of non-volatile memory storage and mature parallel file systems available. At the same time, computational science codes are being coupled to data analysis tools that are not filesystem-oriented. In this paper, we describe how the FAODEL data management service can expose the available data storage options and mediate among them in both application- and FAODEL-directed ways. These capabilities allow applications to exploit their knowledge of the different types of data they may exchange during a workflow execution, and also provide FAODEL with mechanisms to proactively tune data storage behavior when appropriate. We describe the implementation of these capabilities in FAODEL and how they are used by applications, and present preliminary performance results demonstrating the potential benefits of our approach.
SAND2019-6668C.
© 2019 Springer Nature Switzerland AG
Cite this paper
Widener, P., Ulmer, C., Levy, S., Kordenbrock, T., Templet, G. (2019). Mediating Data Center Storage Diversity in HPC Applications with FAODEL. In: Weiland, M., Juckeland, G., Alam, S., Jagode, H. (eds) High Performance Computing. ISC High Performance 2019. Lecture Notes in Computer Science, vol 11887. Springer, Cham. https://doi.org/10.1007/978-3-030-34356-9_22
DOI: https://doi.org/10.1007/978-3-030-34356-9_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-34355-2
Online ISBN: 978-3-030-34356-9
eBook Packages: Computer Science, Computer Science (R0)