Approaches for Memory-Efficient Communication Library and Runtime Communication Optimization

Abstract

This article summarizes the work carried out in the Advanced Communication for Exa (ACE) project. The main motivation for the project was the pressing demand for scalable communication in exascale computation. To meet it, we built a PGAS-based communication library, Advanced Communication Primitives (ACP). Its fundamental communication model is one-sided, following the PGAS model, so that the library's internal memory footprint is kept as small as possible. Using this model, several applications, including magnetohydrodynamic, molecular orbital, and particle simulations, were tuned to achieve higher scalability. In addition, several communication optimization techniques were investigated, in particular tuning methods for collective communication such as message ordering, algorithm selection, and overlapping. The project also developed a network simulator, NSIM-ACE, which simulates the behavior of packets in one-sided communication to study the effects of congestion on interconnects.

Author information

Corresponding author

Correspondence to Takeshi Nanri.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Nanri, T. (2019). Approaches for Memory-Efficient Communication Library and Runtime Communication Optimization. In: Sato, M. (eds) Advanced Software Technologies for Post-Peta Scale Computing. Springer, Singapore. https://doi.org/10.1007/978-981-13-1924-2_7

  • DOI: https://doi.org/10.1007/978-981-13-1924-2_7

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-1923-5

  • Online ISBN: 978-981-13-1924-2

  • eBook Packages: Computer Science, Computer Science (R0)
