CMP Cache Architecture and the OpenMP Performance

  • Conference paper
A Practical Programming Model for the Multi-Core Era (IWOMP 2007)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4935)

Abstract

The chip multiprocessor (CMP) is regarded as the next generation of microprocessor architectures. For programming such machines, OpenMP, a standard shared-memory model, is a strong candidate. This raises a question: how should CMP hardware be designed to deliver high performance for OpenMP applications?

This work explores the answer using cache architecture as a case study. Using a simulator, we investigate how cache organization and reconfigurability influence the parallel execution of an OpenMP program. The results can guide both architecture developers in making hardware design decisions and programmers in writing efficient code.
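
To make the idea of simulator-based cache studies concrete, the following is a minimal sketch and not the simulator used in the paper. It assumes a toy set-associative cache with LRU replacement and two threads that read the same array in lockstep, as they might when a parallel loop reads a shared input. The names `Cache` and `count_misses`, the 64-byte line size, the 4-way associativity, and the 8 KiB sizes are all illustrative assumptions.

```python
from collections import OrderedDict


class Cache:
    """Toy set-associative cache with LRU replacement (illustrative only)."""

    def __init__(self, size_bytes, line_bytes, ways):
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        # One ordered dict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.misses = 0

    def access(self, addr):
        block = addr // self.line_bytes
        cache_set = self.sets[block % self.num_sets]
        if block in cache_set:
            cache_set.move_to_end(block)  # mark as most recently used
            return True
        self.misses += 1
        if len(cache_set) >= self.ways:
            cache_set.popitem(last=False)  # evict the least recently used line
        cache_set[block] = True
        return False


def count_misses(shared, num_threads=2, array_bytes=8192,
                 total_cache_bytes=8192, passes=2):
    """Total misses when every thread reads the whole array `passes` times.

    shared=True  : all threads use one cache of total_cache_bytes.
    shared=False : each thread gets a private cache holding an equal
                   share of total_cache_bytes.
    """
    line_bytes, ways = 64, 4  # assumed parameters, not taken from the paper
    if shared:
        one = Cache(total_cache_bytes, line_bytes, ways)
        caches = [one] * num_threads
    else:
        caches = [Cache(total_cache_bytes // num_threads, line_bytes, ways)
                  for _ in range(num_threads)]
    for _ in range(passes):
        for addr in range(0, array_bytes, 8):  # 8-byte elements
            for cache in caches:               # threads advance in lockstep
                cache.access(addr)
    distinct = {id(c): c for c in caches}.values()
    return sum(c.misses for c in distinct)


if __name__ == "__main__":
    print("shared cache misses: ", count_misses(shared=True))
    print("private cache misses:", count_misses(shared=False))
```

In this toy setting the shared cache holds a single copy of the array and only takes cold misses, while each private cache is too small for the array and thrashes under LRU. Real workloads and real hardware behave far less cleanly, which is why studies of this kind rely on detailed simulation.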

Editor information

Barbara Chapman, Weiming Zheng, Guang R. Gao, Mitsuhisa Sato, Eduard Ayguadé, Dongsheng Wang

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tao, J., Hoàng, K.D., Karl, W. (2008). CMP Cache Architecture and the OpenMP Performance. In: Chapman, B., Zheng, W., Gao, G.R., Sato, M., Ayguadé, E., Wang, D. (eds) A Practical Programming Model for the Multi-Core Era. IWOMP 2007. Lecture Notes in Computer Science, vol 4935. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69303-1_7

  • DOI: https://doi.org/10.1007/978-3-540-69303-1_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-69302-4

  • Online ISBN: 978-3-540-69303-1

  • eBook Packages: Computer Science, Computer Science (R0)
