Support for Fine-Grained Synchronization in Shared-Memory Multiprocessors

  • Conference paper
Parallel Computing Technologies (PaCT 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4671)

Abstract

It has already been shown that hardware-supported fine-grained synchronization provides a significant performance improvement over coarse-grained synchronization mechanisms such as barriers. Support for fine-grained synchronization on individual data items becomes especially important for efficiently exploiting the thread-level parallelism available on multithreaded and multi-core processors. Fine-grained synchronization can be achieved using full/empty tagged shared memory. We define a complete set of synchronizing memory instructions, as well as the architecture of the full/empty tagged shared memory that supports these operations. We develop a snoopy cache coherence protocol for an SMP with centralized full/empty tagged memory.
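
As a rough illustration of the full/empty idea mentioned in the abstract, the sketch below emulates a single tagged memory word in software. It is not the instruction set or the hardware design defined in the paper, which this preview does not show. The class `FullEmptyCell` and the operation names `write_when_empty` and `read_when_full` are hypothetical, and the blocking behavior is an assumption modeled with an ordinary condition variable rather than hardware tags.

```python
import threading


class FullEmptyCell:
    """Software emulation of one memory word with a full/empty tag bit.

    Hypothetical names; a hardware implementation would keep the tag
    alongside the word in memory and in the caches.
    """

    def __init__(self):
        self._value = None
        self._full = False  # tag bit: False = empty, True = full
        self._cond = threading.Condition()

    def write_when_empty(self, value):
        """Wait until the word is empty, store value, then mark it full."""
        with self._cond:
            self._cond.wait_for(lambda: not self._full)
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read_when_full(self):
        """Wait until the word is full, load it, then mark it empty."""
        with self._cond:
            self._cond.wait_for(lambda: self._full)
            value = self._value
            self._full = False
            self._cond.notify_all()
            return value


def run_producer_consumer(n):
    """Pass n items through a single cell; return what the consumer saw."""
    cell = FullEmptyCell()
    received = []

    def producer():
        for i in range(n):
            cell.write_when_empty(i)

    def consumer():
        for _ in range(n):
            received.append(cell.read_when_full())

    threads = [
        threading.Thread(target=producer),
        threading.Thread(target=consumer),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Calling `run_producer_consumer(5)` hands five values from one thread to another through a single word, with no separate lock or barrier in the calling code. Synchronization is attached to the data item itself, which is the sense in which the approach is fine-grained.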

Editor information

Victor Malyshkin

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vlassov, V., Merino, O.S., Moritz, C.A., Popov, K. (2007). Support for Fine-Grained Synchronization in Shared-Memory Multiprocessors. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2007. Lecture Notes in Computer Science, vol 4671. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73940-1_45

  • DOI: https://doi.org/10.1007/978-3-540-73940-1_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73939-5

  • Online ISBN: 978-3-540-73940-1

  • eBook Packages: Computer Science, Computer Science (R0)
