On the Algorithmic Aspects of Using OpenMP Synchronization Mechanisms II: User-Guided Speculative Locks

  • Conference paper
OpenMP: Heterogenous Execution and Data Movements (IWOMP 2015)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 9342)

Abstract

In this paper we continue the investigation begun in [8] into the effects of different synchronization mechanisms on OpenMP-threaded iterative mesh optimization algorithms. We port our test code to the Intel® Xeon® processor (former codename “Haswell”) by employing a user-guided locking API for OpenMP [4] that provides a general and unified user interface and runtime framework. Since the Intel® Transactional Synchronization Extensions (TSX) provide two options for speculation, Hardware Lock Elision (HLE) and Restricted Transactional Memory (RTM), we compare four run modes: (i) HLE, (ii) RTM, (iii) OpenMP critical, and (iv) “unsynchronized”. As in [8], we find that each of the two speculative execution options always outperforms the other two modes in convergence characteristics. Despite their higher overhead, the TSX options are very competitive in runtime performance as measured by the “time-to-convergence” criterion introduced in [8].
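To make the comparison of run modes concrete, the following is a minimal sketch of how a speculative lock might be requested in application code. It assumes the OpenMP 4.5 routine `omp_init_lock_with_hint` with `omp_lock_hint_speculative` as a stand-in for the user-guided locking API of [4]; the interface actually used in the paper may differ, and whether the hint is mapped to HLE, RTM, or an ordinary lock is decided by the runtime and the hardware. The ring mesh, the function `ring_scatter`, and its parameters are hypothetical and are not the paper's mesh optimization code. The “unsynchronized” mode is left out because it is a data race by construction.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>  /* omp_init_lock_with_hint requires OpenMP 4.5 or later */
#else
/* Serial stand-ins (assumption) so the sketch also builds without OpenMP. */
typedef int omp_lock_t;
typedef enum { omp_lock_hint_speculative = 8 } omp_lock_hint_t;
static void omp_init_lock(omp_lock_t *l) { *l = 0; }
static void omp_init_lock_with_hint(omp_lock_t *l, omp_lock_hint_t h) { (void)h; *l = 0; }
static void omp_set_lock(omp_lock_t *l) { *l = 1; }
static void omp_unset_lock(omp_lock_t *l) { *l = 0; }
static void omp_destroy_lock(omp_lock_t *l) { (void)l; }
#endif

/* Hypothetical kernel: on a ring of n_nodes nodes, edge e joins node e and
 * node (e + 1) % n_nodes. Each sweep adds 1 to both endpoints of every edge,
 * so neighbouring edges update shared nodes and need synchronization. */
void ring_scatter(int speculative, int n_nodes, int sweeps, long *node)
{
    omp_lock_t lock;
    assert(n_nodes > 0 && sweeps >= 0 && node != 0);

    /* The hint only expresses a preference; the runtime may ignore it. */
    if (speculative)
        omp_init_lock_with_hint(&lock, omp_lock_hint_speculative);
    else
        omp_init_lock(&lock);

    for (int s = 0; s < sweeps; ++s) {
#ifdef _OPENMP
#pragma omp parallel for
#endif
        for (int e = 0; e < n_nodes; ++e) {
            int a = e;
            int b = (e + 1) % n_nodes;
            omp_set_lock(&lock);    /* critical region begins */
            node[a] += 1;
            node[b] += 1;
            omp_unset_lock(&lock);  /* critical region ends */
        }
    }
    omp_destroy_lock(&lock);
}
```

Built with OpenMP enabled (for example `-fopenmp` with GCC or Clang), calling `ring_scatter(1, 8, 100, node)` on a zero-initialized array of 8 entries leaves every entry at 200, the same result as the plain-lock variant. The application code is identical in both cases; only the lock initialization differs, which is the appeal of a hint-based interface.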

References

  1. Intel ARK. http://ark.intel.com

  2. Intel® Threading Building Blocks. https://www.threadingbuildingblocks.org

  3. LLVM. http://www.llvm.org

  4. Bae, H., Cownie, J., Klemm, M., Terboven, C.: A user-guided locking API for the OpenMP* application program interface. In: DeRose, L., de Supinski, B.R., Olivier, S.L., Chapman, B.M., Müller, M.S. (eds.) IWOMP 2014. LNCS, vol. 8766, pp. 173–186. Springer, Heidelberg (2014)

  5. Baker, A.H., Falgout, R.D., Kolev, T.V., Yang, U.M.: Multigrid smoothers for ultraparallel computing. SIAM J. Sci. Comput. 33, 2864–2887 (2011)

  6. Bihari, B.L.: Applicability of transactional memory to modern codes. In: International Conference on Numerical Analysis and Applied Mathematics 2010 (ICNAAM 2010) Conference Proceedings, pp. 1764–1767. APS, Rodos (2010)

  7. Bihari, B.L.: Transactional memory for unstructured mesh simulations. J. Sci. Comput. 54, 311–332 (2012)

  8. Bihari, B.L., Wong, M., de Supinski, B.R., Diachin, L.: On the algorithmic aspects of using OpenMP synchronization mechanisms: the effects of transactional memory. In: DeRose, L., de Supinski, B.R., Olivier, S.L., Chapman, B.M., Müller, M.S. (eds.) IWOMP 2014. LNCS, vol. 8766, pp. 115–129. Springer, Heidelberg (2014)

  9. Bihari, B.L., Wong, M., Wang, A., de Supinski, B.R., Chen, W.: A case for including transactions in OpenMP II: hardware transactional memory. In: Chapman, B.M., Massaioli, F., Müller, M.S., Rorro, M. (eds.) IWOMP 2012. LNCS, vol. 7312, pp. 44–58. Springer, Heidelberg (2012)

  10. Drepper, U., Molnar, I.: The native POSIX thread library for Linux. Technical report, Redhat (2003)

  11. IBM Compiler Group: IBM XL C/C++ for Blue Gene/Q, V12.1 Compiler Reference (2012)

  12. Haring, R.A., Ohmacht, M., Fox, T.W., Gschwind, M.K., Satterfield, D.L., Sugavanam, K., Coteus, P.W., Heidelberger, P., Blumrich, M.A., Wisniewski, R.W., Gara, A., Chiu, G.L.-T., Boyle, P.A., Christ, N.H., Kim, C.: The IBM Blue Gene/Q compute chip. IEEE Micro 32(2), 48–60 (2012)

  13. Herlihy, M., Moss, J.E.B.: Transactional memory: architectural support for lock-free data structures. SIGARCH Comput. Archit. News 21(2), 289–300 (1993)

  14. Intel Corporation: Intel® Architecture Instruction Set Extensions Programming Reference. Document number 319433-014 (2012)

  15. Intel Corporation: Intel® OpenMP* Runtime Library (2015). http://www.openmprtl.org/

  16. Jacobi, C., Slegel, T., Greiner, D.: Transactional memory architecture and implementation for IBM System z. In: 2012 45th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp. 25–36, December 2012

  17. Kleen, A.: Lock Elision in the GNU C library. LWN.net 12(1), (2013). http://lwn.net/Articles/534758/

  18. Knupp, P.: Hexahedral and tetrahedral mesh shape optimization. Intl. J. Numer. Meth. Engr. 58, 319–332 (2003)

  19. Le, H.Q., Guthrie, G.L., Williams, D.E., Michael, M.M., Frey, B.G., Starke, W.J., May, C., Odaira, R., Nakaike, T.: Transactional memory support in the IBM POWER8 processor. IBM J. Res. Dev. 59(1), 8:1–8:14 (2015)

  20. Miller, D.: The GNU C Library version 2.18 is now available. Announcement on the info-gnu mailing list (2013). http://lists.gnu.org/archive/html/info-gnu/2013-08/msg00003.html

  21. OpenMP Architecture Review Board: OpenMP Application Program Interface, Version 4.0 (2013). http://www.openmp.org/

  22. Schindewolf, M., Gyllenhaal, J., Bihari, B.L., Wang, A., Schulz, M., Karl, W.: What scientific applications can benefit from hardware transactional memory? In: International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2012 (2012)

  23. Wang, A., Gaudet, M., Wu, P., Ohmacht, M., Amaral, J.N., Barton, C., Silvera, R., Michael, M.: Evaluation of Blue Gene/Q hardware support for transactional memories. In: PACT (2012)

  24. Wong, M., Bihari, B.L., de Supinski, B.R., Wu, P., Michael, M., Liu, Y., Chen, W.: A case for including transactions in OpenMP. In: Sato, M., Hanawa, T., Müller, M.S., Chapman, B.M., de Supinski, B.R. (eds.) IWOMP 2010. LNCS, vol. 6132, pp. 149–160. Springer, Heidelberg (2010)

Acknowledgments

The authors thank Trent E. D’Hooge of Livermore Computing for his assistance with our inquiries and in accommodating our runs on the local compute nodes.

Intel and Xeon are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

* Other names and brands are the property of their respective owners.

Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance.

Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Corresponding author

Correspondence to Christian Terboven.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Bihari, B.L., Bae, H., Cownie, J., Klemm, M., Terboven, C., Diachin, L. (2015). On the Algorithmic Aspects of Using OpenMP Synchronization Mechanisms II: User-Guided Speculative Locks. In: Terboven, C., de Supinski, B., Reble, P., Chapman, B., Müller, M. (eds) OpenMP: Heterogenous Execution and Data Movements. IWOMP 2015. Lecture Notes in Computer Science, vol 9342. Springer, Cham. https://doi.org/10.1007/978-3-319-24595-9_10

  • DOI: https://doi.org/10.1007/978-3-319-24595-9_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24594-2

  • Online ISBN: 978-3-319-24595-9

  • eBook Packages: Computer Science, Computer Science (R0)