Double Buffering for MCDRAM on Second Generation Intel® Xeon Phi™ Processors with OpenMP

Olivier, Stephen L.; Hammond, Simon D.; Duran, Alejandro

doi:10.1007/978-3-319-65578-9_21

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 10468)

Included in the following conference series:

International Workshop on OpenMP

Abstract

Emerging novel architectures for shared memory parallel computing are incorporating increasingly creative innovations to deliver higher memory performance. A notable exemplar of this phenomenon is the Multi-Channel DRAM (MCDRAM) that is included in the Intel® Xeon Phi™ processors. In this paper, we examine techniques to use OpenMP to exploit the high bandwidth of MCDRAM by staging data. In particular, we implement double buffering using OpenMP sections and tasks to explicitly manage movement of data into MCDRAM. We compare our double-buffered approach to a non-buffered implementation and to Intel’s cache mode, in which the system manages the MCDRAM as a transparent cache. We also demonstrate the sensitivity of performance to parameters such as dataset size and the distribution of threads between compute and copy operations.

(“The rights of this work are transferred to the extent transferable according to title 17 § 105 U.S.C.”).
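
To make the double-buffering idea in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical kernel (a sum of squares) and hypothetical names (`double_buffered_sum_squares`, `compute_chunk`, `chunk_len`). The two staging buffers are passed in by the caller; on a Xeon Phi system they would presumably be allocated in MCDRAM, for example with `hbw_malloc` from the memkind library, but here they are ordinary arrays so the sketch runs anywhere. One OpenMP section copies the next chunk while the other computes on the current one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical kernel: sum of squares over one staged chunk. */
static double compute_chunk(const double *buf, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += buf[i] * buf[i];
    return sum;
}

/* Length of chunk c when n elements are split into pieces of size chunk. */
static size_t chunk_len(size_t n, size_t chunk, size_t c)
{
    size_t start = c * chunk;
    return (n - start < chunk) ? n - start : chunk;
}

/*
 * Sum of squares of src[0..n), staged through two buffers of `chunk`
 * elements each. buf0 and buf1 stand in for MCDRAM-resident buffers
 * (an assumption of this sketch; any memory works).
 */
double double_buffered_sum_squares(const double *src, size_t n,
                                   double *buf0, double *buf1, size_t chunk)
{
    assert(chunk > 0);
    if (n == 0)
        return 0.0;

    double *buf[2] = { buf0, buf1 };
    size_t nchunks = (n + chunk - 1) / chunk;
    double total = 0.0;

    /* Prologue: stage the first chunk before any compute starts. */
    memcpy(buf[0], src, chunk_len(n, chunk, 0) * sizeof(double));

    for (size_t c = 0; c < nchunks; c++) {
        double partial = 0.0;

        #pragma omp parallel sections num_threads(2)
        {
            #pragma omp section
            {
                /* Copy side: prefetch chunk c+1 into the idle buffer. */
                if (c + 1 < nchunks)
                    memcpy(buf[(c + 1) % 2], src + (c + 1) * chunk,
                           chunk_len(n, chunk, c + 1) * sizeof(double));
            }
            #pragma omp section
            {
                /* Compute side: work on the chunk staged last step. */
                partial = compute_chunk(buf[c % 2], chunk_len(n, chunk, c));
            }
        }
        /* The implicit barrier above guarantees the copy has finished
           before the buffers swap roles in the next iteration. */
        total += partial;
    }
    return total;
}
```

In this sketch each side runs on a single thread. The abstract notes that performance is sensitive to how threads are divided between compute and copy, so a fuller version would likely give each section its own nested team of threads. A task-based variant could express the same overlap with one copy task and one compute task per chunk. Compiled without OpenMP support, the pragmas are ignored and the code runs serially with the same result.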

Acknowledgments

Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525. We gratefully acknowledge the use of the Advanced Architecture Test Bed, Bowman, at Sandia National Laboratories. The test beds are provided by NNSA’s Advanced Simulation and Computing (ASC) program for research and development of advanced architectures for exascale computing.

Disclaimers: Intel, Xeon, and Xeon Phi are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

* Other brands and names are the property of their respective owners.

Author information

Authors and Affiliations

Center for Computing Research, Sandia National Laboratories, Albuquerque, USA
Stephen L. Olivier & Simon D. Hammond
Intel Corporation Iberia, Madrid, Spain
Alejandro Duran

Corresponding author

Correspondence to Stephen L. Olivier.

Editor information

Editors and Affiliations

Lawrence Livermore National Laboratory, Livermore, California, USA
Bronis R. de Supinski
Sandia National Laboratories, Albuquerque, New Mexico, USA
Stephen L. Olivier
RWTH Aachen University, Aachen, Germany
Christian Terboven
Stony Brook University, Stony Brook, New York, USA
Barbara M. Chapman
RWTH Aachen University, Aachen, Germany
Matthias S. Müller

Cite this paper

Olivier, S.L., Hammond, S.D., Duran, A. (2017). Double Buffering for MCDRAM on Second Generation Intel® Xeon Phi™ Processors with OpenMP. In: de Supinski, B., Olivier, S., Terboven, C., Chapman, B., Müller, M. (eds) Scaling OpenMP for Exascale Performance and Portability. IWOMP 2017. Lecture Notes in Computer Science, vol 10468. Springer, Cham. https://doi.org/10.1007/978-3-319-65578-9_21

DOI: https://doi.org/10.1007/978-3-319-65578-9_21
Published: 17 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-65577-2
Online ISBN: 978-3-319-65578-9
eBook Packages: Computer Science, Computer Science (R0)
