Hobbes: A Multi-kernel Infrastructure for Application Composition

Chapter in: Operating Systems for Supercomputers and High Performance Computing

Part of the book series: High-Performance Computing Series (HPC, volume 1)

Abstract

This chapter describes the Hobbes OS/R environment, which was designed to support the construction of sophisticated application compositions across multiple system software stacks called enclaves. The core idea of the approach is to enable each application component to execute in the system software environment that best matches its requirements. Hobbes then provides a set of cross-enclave composition mechanisms enabling the individual components to work together as part of a larger application workflow. Unique aspects of Hobbes compared to other multi-kernels include its emphasis on supporting application composition, its focus on providing cross-enclave performance isolation, and its use of hardware virtualization to enable the use of arbitrary OS/Rs. In particular, Hobbes leverages distributed, user-level resource management and hardware virtualization to allow underlying OS kernels to be largely agnostic of the multi-kernel environment, making it straightforward to add support for new OS kernels to Hobbes. We demonstrate Hobbes using a modern Cray XC30m machine, showing the generality of OS/R configurations it supports, as well as its ability to leverage existing unmodified HPC system management tools.
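
The composition mechanism sketched above — components in separate enclaves cooperating through a shared-memory channel — can be made concrete with a small sketch. Hobbes implements cross-enclave shared memory with XEMEM; since XEMEM is not assumed to be installed here, the sketch below substitutes POSIX shared memory on a single Linux node to illustrate the same export/attach, producer/consumer pattern. The segment name, struct layout, and single-process flow are illustrative assumptions, not the Hobbes API.

    /* Stand-in sketch: Hobbes would export a XEMEM segment from a producer
     * enclave and attach it from a consumer enclave; POSIX shared memory
     * plays that role here. Build with: cc -o compose compose.c
     * (add -lrt on older glibc). */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define SEG_NAME "/hobbes_demo_seg"   /* hypothetical segment name */

    struct channel {
        volatile int ready;   /* set by the producer, polled by the consumer */
        char payload[256];    /* data handed from simulation to analytics */
    };

    int main(void)
    {
        /* "Export" a segment: create and size a named shared-memory object. */
        int fd = shm_open(SEG_NAME, O_CREAT | O_RDWR, 0600);
        if (fd < 0) { perror("shm_open"); return 1; }
        if (ftruncate(fd, sizeof(struct channel)) < 0) { perror("ftruncate"); return 1; }

        /* "Attach" the segment: map it into this address space; a consumer
         * in another enclave would map the same exported segment. */
        struct channel *ch = mmap(NULL, sizeof(*ch), PROT_READ | PROT_WRITE,
                                  MAP_SHARED, fd, 0);
        if (ch == MAP_FAILED) { perror("mmap"); return 1; }

        /* Producer side: publish data, then raise the ready flag. */
        strcpy(ch->payload, "simulation output");
        ch->ready = 1;

        /* Consumer side (normally a separate process): read once ready. */
        if (ch->ready)
            printf("consumer read: %s\n", ch->payload);

        munmap(ch, sizeof(*ch));
        close(fd);
        shm_unlink(SEG_NAME);
        return 0;
    }

In an actual Hobbes deployment the producer and consumer would run in different enclaves — for example, a simulation on a lightweight kernel and analytics on Linux — with XEMEM providing the shared mapping across kernels.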

This contribution has been co-authored by Sandia National Laboratories, a multimission laboratory managed and operated by National Technology & Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525, and by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the contribution for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, and worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.

Notes

  1. In Linux, address space layouts for the shadow processes can be set with the "-mcmodel" and "-pie" parameters to the gcc compiler (see the sketch below).
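
As a concrete illustration of this note (a hedged sketch: the file name and build lines are assumptions, while the flag behavior is standard gcc on x86-64 Linux), "-mcmodel" selects the code model and "-pie"/"-no-pie" control whether the executable is position independent, which together determine where a shadow process's code and data land in the address space.

    /* layout.c: print where code and data were placed.
     *
     *   cc -mcmodel=large -no-pie -o layout layout.c   # fixed link-time layout
     *   cc -fPIE -pie             -o layout layout.c   # PIE layout, randomized by ASLR
     */
    #include <stdio.h>

    static int global_datum = 42;   /* placed in the data segment */

    int main(void)
    {
        /* With -no-pie these addresses are fixed at link time; with -pie
         * they are randomized by the loader at each run. */
        printf("main  at %p\n", (void *)main);
        printf("datum at %p\n", (void *)&global_datum);
        return 0;
    }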

Author information

Corresponding author: Brian Kocoloski.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Kocoloski, B., Lange, J., Pedretti, K., Brightwell, R. (2019). Hobbes: A Multi-kernel Infrastructure for Application Composition. In: Gerofi, B., Ishikawa, Y., Riesen, R., Wisniewski, R.W. (eds) Operating Systems for Supercomputers and High Performance Computing. High-Performance Computing Series, vol 1. Springer, Singapore. https://doi.org/10.1007/978-981-13-6624-6_15

  • DOI: https://doi.org/10.1007/978-981-13-6624-6_15

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6623-9

  • Online ISBN: 978-981-13-6624-6

  • eBook Packages: Computer Science, Computer Science (R0)
