
Low-Overhead, High-Speed Multi-core Barrier Synchronization

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5952)

Abstract

Whereas efficient barrier implementations were once a concern only in high-performance computing, recent trends in core integration make the topic relevant even for general-purpose CMPs. Although CMP applications require low latency, the cost of low-latency barrier implementations using hardware-based techniques can be prohibitive for CMPs, where die area represents opportunities for throughput and yield. Similarly, whereas traditional multiprocessor barrier implementations were developed primarily for dedicated environments, scheduling and multi-programming on CMPs require more adaptable barrier implementations.

In this paper, we present and evaluate three barrier implementations that are hybrids of software and dedicated hardware barriers and are specifically tailored for CMPs. The implementations leverage the unique characteristics of CMPs and provide low latency comparable to that of dedicated hardware networks at a fraction of the cost. The implementations also support adaptability, enabling efficient multi-programming and dynamic remapping of the barrier network.
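
For readers unfamiliar with the software side of the design space, the following is a minimal sketch of a centralized sense-reversing barrier, a classic software-only technique. It is not one of the three hybrid implementations evaluated in the paper, whose details are not given in this abstract. The names `SenseReversingBarrier` and `run_phases` are hypothetical and introduced only for illustration.

```python
import threading


class SenseReversingBarrier:
    """Centralized software barrier: one shared counter plus a sense flag.

    Illustrative only; not the hybrid hardware/software design from the paper.
    """

    def __init__(self, num_threads):
        self.num_threads = num_threads
        self.count = num_threads   # arrivals still expected in this episode
        self.sense = False         # flips each time the barrier opens
        self.cond = threading.Condition()

    def wait(self):
        with self.cond:
            # The value the shared flag will take when this episode ends.
            local_sense = not self.sense
            self.count -= 1
            if self.count == 0:
                # Last arrival: reset the counter and release all waiters.
                self.count = self.num_threads
                self.sense = local_sense
                self.cond.notify_all()
            else:
                # Earlier arrivals block until the flag is flipped.
                while self.sense != local_sense:
                    self.cond.wait()


def run_phases(num_threads=4, num_phases=3):
    """Run workers through several phases separated by the barrier."""
    barrier = SenseReversingBarrier(num_threads)
    log = []
    log_lock = threading.Lock()

    def worker():
        for phase in range(num_phases):
            with log_lock:
                log.append(phase)   # record the phase this thread is in
            barrier.wait()          # nobody starts phase + 1 early

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because every thread must reach `wait()` before any thread proceeds, the recorded phase numbers never go backwards. Every arrival updates the same shared counter and flag, which is the kind of serialization and coherence traffic that motivates hardware-assisted or hybrid barriers.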




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sartori, J., Kumar, R. (2010). Low-Overhead, High-Speed Multi-core Barrier Synchronization. In: Patt, Y.N., Foglia, P., Duesterwald, E., Faraboschi, P., Martorell, X. (eds) High Performance Embedded Architectures and Compilers. HiPEAC 2010. Lecture Notes in Computer Science, vol 5952. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11515-8_4


  • DOI: https://doi.org/10.1007/978-3-642-11515-8_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11514-1

  • Online ISBN: 978-3-642-11515-8

  • eBook Packages: Computer Science, Computer Science (R0)
