Support for Fine-Grained Synchronization in Shared-Memory Multiprocessors

Vlassov, Vladimir; Merino, Oscar Sierra; Moritz, Csaba Andras; Popov, Konstantin

doi:10.1007/978-3-540-73940-1_45

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4671)

Included in the following conference series:

International Conference on Parallel Computing Technologies

Abstract

It has already been shown that hardware-supported fine-grained synchronization provides a significant performance improvement over coarse-grained synchronization mechanisms, such as barriers. Support for fine-grained synchronization on individual data items is becoming particularly important for efficiently exploiting the thread-level parallelism available on multithreading and multi-core processors. Fine-grained synchronization can be achieved using full/empty tagged shared memory. We define the complete set of synchronizing memory instructions as well as the architecture of the full/empty tagged shared memory that supports these operations. We develop a snoopy cache coherence protocol for an SMP with centralized full/empty tagged memory.
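To illustrate the general idea of full/empty tagging mentioned in the abstract, the following is a minimal software sketch in Python, not the hardware mechanism or the instruction set defined in the paper. The class `FullEmptyCell` and the method names `write_fill` and `read_empty` are hypothetical names chosen for this example; the sketch only assumes the commonly described semantics in which a write waits for an empty word and marks it full, and a read waits for a full word and marks it empty.

```python
import threading


class FullEmptyCell:
    """Software model of one memory word with a full/empty tag (illustrative only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._full = False   # the tag bit: False = empty, True = full
        self._value = None

    def write_fill(self, value):
        """Wait until the word is empty, store the value, and set the tag to full."""
        with self._cond:
            self._cond.wait_for(lambda: not self._full)
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read_empty(self):
        """Wait until the word is full, return the value, and reset the tag to empty."""
        with self._cond:
            self._cond.wait_for(lambda: self._full)
            self._full = False
            self._cond.notify_all()
            return self._value


def demo(n=5):
    """One producer and one consumer synchronize on a single tagged word."""
    cell = FullEmptyCell()
    received = []

    def producer():
        for i in range(n):
            cell.write_fill(i)

    def consumer():
        for _ in range(n):
            received.append(cell.read_empty())

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


if __name__ == "__main__":
    print(demo())
```

In this sketch the synchronization is attached to the individual data item rather than to a barrier covering a whole phase of computation, which is the contrast the abstract draws between fine-grained and coarse-grained mechanisms.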

Author information

Authors and Affiliations

Royal Institute of Technology (KTH), Stockholm, Sweden
Vladimir Vlassov & Oscar Sierra Merino
University of Massachusetts (UMASS), Amherst, MA, U.S.A.
Csaba Andras Moritz
Swedish Institute of Computer Science (SICS), Stockholm, Sweden
Konstantin Popov

Editor information

Victor Malyshkin

Cite this paper

Vlassov, V., Merino, O.S., Moritz, C.A., Popov, K. (2007). Support for Fine-Grained Synchronization in Shared-Memory Multiprocessors. In: Malyshkin, V. (eds) Parallel Computing Technologies. PaCT 2007. Lecture Notes in Computer Science, vol 4671. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73940-1_45

DOI: https://doi.org/10.1007/978-3-540-73940-1_45
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73939-5
Online ISBN: 978-3-540-73940-1
eBook Packages: Computer Science, Computer Science (R0)
