
System Software for Many-Core and Multi-core Architecture

  • Chapter
Advanced Software Technologies for Post-Peta Scale Computing

Abstract

This project explored software technologies for post-peta-scale computing, specifically OS technologies for heterogeneous architectures, lightweight threading, scalable I/O, and fault mitigation. For the OS technologies, a new parallel execution model for many-core CPUs, Partitioned Virtual Address Space (PVAS), was proposed. For heterogeneous architectures, where a multi-core CPU and a many-core CPU are connected via an I/O bus, an extension of PVAS, Multiple-PVAS, was proposed to provide a unified virtual address space spanning the multi-core and many-core CPUs. PVAS was further enhanced to support multiple processes whose context switches take place at the user level (named User-Level Process: ULP). For scalable I/O, EARTH, a set of optimization techniques for MPI collective I/O, was proposed. Lastly, for fault mitigation, User-Level Fault Mitigation (ULFM) was improved with a faster agreement algorithm, and a sliding method for substituting failed nodes with spare nodes was proposed. The funding of this project ended in 2016; however, many of the proposed technologies are still under active development.



Author information


Correspondence to Atsushi Hori.


Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter


Cite this chapter

Hori, A. et al. (2019). System Software for Many-Core and Multi-core Architecture. In: Sato, M. (eds) Advanced Software Technologies for Post-Peta Scale Computing. Springer, Singapore. https://doi.org/10.1007/978-981-13-1924-2_4


  • DOI: https://doi.org/10.1007/978-981-13-1924-2_4


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1923-5

  • Online ISBN: 978-981-13-1924-2

  • eBook Packages: Computer Science (R0)
