Netcaches on Engineering and Commercial Applications

  • Chapter
High Performance Computing Systems and Applications

Part of the book series: The Kluwer International Series in Engineering and Computer Science ((SECS,volume 657))

Abstract

This paper evaluates the performance benefits and problems associated with netcaches (network caches), also known as shared remote access caches (RACs), in SMP-based multiprocessors running scientific/engineering and commercial applications. We compare the effects of these caches under two memory consistency models: sequential consistency and release consistency. We use SMP (symmetric multiprocessing) nodes as the building blocks of the multiprocessor because they are readily available and cost-effective, which makes them an attractive choice for current and future designs.

For the scientific/engineering workloads, we simulate six applications from the SPLASH-2 benchmark suite. We compare the performance of these six programs under two alternatives: (i) the baseline architecture (sequential consistency) and (ii) future architectures (release consistency). Each alternative is evaluated both with netcaches in the system and without them. We simulate a machine with 32 processors organized into SMP clusters. Similarly, we show the impact of netcaches on these systems for three database benchmarks: TPC-B, TPC-C and TPC-D.

Our simulation results show that netcaches are more effective for commercial applications than for scientific and engineering applications. Furthermore, the impact of these caches is more significant on machines using release consistency than on machines using sequential consistency.
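As a rough illustration of the idea evaluated here, the sketch below models a netcache as a small per-node cache that holds blocks fetched from other nodes, so that a later miss to the same remote block can be serviced inside the SMP node. It is not the simulator used in the chapter: the names `NetCache`, `home_node` and `access`, the LRU replacement policy, the block-interleaved home mapping and the choice of 8 nodes are all assumptions made for the example, and coherence and the consistency models are ignored entirely.

```python
from collections import OrderedDict


class NetCache:
    """Hypothetical shared remote access cache for one SMP node (LRU, block-granular)."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # block id -> True, ordered from least to most recent

    def lookup(self, block):
        """Return True on a hit and mark the block as most recently used."""
        if block in self.blocks:
            self.blocks.move_to_end(block)
            return True
        return False

    def fill(self, block):
        """Insert a remote block, evicting the least recently used one if full."""
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)


def home_node(block, num_nodes):
    """Assumed mapping: blocks are interleaved across nodes by block id."""
    return block % num_nodes


def access(node, block, num_nodes, netcaches):
    """Return which level services a processor-cache miss issued from `node`.

    `netcaches` maps node id -> NetCache; a node absent from the mapping
    models the configuration without a netcache.
    """
    if home_node(block, num_nodes) == node:
        return "local_memory"
    netcache = netcaches.get(node)
    if netcache is not None and netcache.lookup(block):
        return "netcache"
    if netcache is not None:
        netcache.fill(block)  # keep the remote block for later misses
    return "remote_memory"


if __name__ == "__main__":
    num_nodes = 8  # assumption: the cluster size is not stated in the abstract
    netcaches = {n: NetCache(capacity_blocks=2) for n in range(num_nodes)}
    for block in (8, 9, 9, 10, 11, 9):
        print(block, access(0, block, num_nodes, netcaches))
```

Passing an empty mapping for `netcaches` gives the "without netcache" configuration, in which every miss to a remote block goes to the remote node.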

Copyright information

© 2002 Springer Science+Business Media New York

About this chapter

Cite this chapter

Moreno, E.D. (2002). Netcaches on Engineering and Commercial Applications. In: Dimopoulos, N.J., Li, K.F. (eds) High Performance Computing Systems and Applications. The Kluwer International Series in Engineering and Computer Science, vol 657. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-0849-6_28

  • DOI: https://doi.org/10.1007/978-1-4615-0849-6_28

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4613-5269-3

  • Online ISBN: 978-1-4615-0849-6

  • eBook Packages: Springer Book Archive
