CMP Cache Architecture and the OpenMP Performance

Tao, Jie; Hoàng, Kim D.; Karl, Wolfgang

doi:10.1007/978-3-540-69303-1_7


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4935)

Included in the following conference series:

International Workshop on OpenMP

Abstract

Chip multiprocessors (CMPs) are regarded as the next generation of microprocessor architectures. For programming such machines, OpenMP, a standard shared-memory programming model, is a strong candidate. This raises a question: how should CMP hardware be designed to deliver high performance for OpenMP applications?

This work explores the answer using cache architecture as a case study. Using a simulator, we investigate how cache organization and reconfigurability influence the parallel execution of an OpenMP program. The results can guide architecture developers in making hardware design decisions and programmers in writing efficient code.



Author information

Authors and Affiliations

Department of Computer Science and Technology, Jilin University, P.R. China
Jie Tao
Institut für wissenschaftliches Rechnen, Forschungszentrum Karlsruhe GmbH, Postfach 3640, 76021, Karlsruhe, Germany
Jie Tao
Institut für Technische Informatik, Universität Karlsruhe (TH), 76128, Karlsruhe, Germany
Kim D. Hoàng & Wolfgang Karl

Editor information

Barbara Chapman, Weiming Zheng, Guang R. Gao, Mitsuhisa Sato, Eduard Ayguadé, Dongsheng Wang

Cite this paper

Tao, J., Hoàng, K.D., Karl, W. (2008). CMP Cache Architecture and the OpenMP Performance. In: Chapman, B., Zheng, W., Gao, G.R., Sato, M., Ayguadé, E., Wang, D. (eds) A Practical Programming Model for the Multi-Core Era. IWOMP 2007. Lecture Notes in Computer Science, vol 4935. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69303-1_7

DOI: https://doi.org/10.1007/978-3-540-69303-1_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69302-4
Online ISBN: 978-3-540-69303-1
eBook Packages: Computer Science, Computer Science (R0)
