Locality and false sharing in coherent-cache parallel graph reduction

  • Conference paper

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 694))

Abstract

Parallel graph reduction is a model for parallel program execution in which shared memory is used under a strict access regime with single assignment and blocking reads. We outline the design of an efficient and accurate multiprocessor simulation scheme and present the results of a simulation study of the performance of a suite of benchmark programs operating under a cache coherency protocol that is representative of protocols used in commercial shared-memory machines and in more scalable distributed shared-memory systems. We analyse the influence of cache line size on performance and expose the relative contributions of spatial, temporal and processor locality and false sharing to overall performance.
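
As a rough illustration of why cache line size interacts with false sharing, the following minimal Python sketch counts how many pairs of distinct addresses, each pair assumed to be written by two different processors, land on the same cache line as the line size grows. It is not the simulation scheme described in the paper; the function names, the address pairs and the line sizes are all hypothetical and chosen only for illustration.

```python
def cache_line(addr: int, line_size: int) -> int:
    """Return the index of the cache line that contains byte address addr."""
    return addr // line_size


def falsely_shared(addr_a: int, addr_b: int, line_size: int) -> bool:
    """True when two distinct addresses fall on the same cache line.

    Assumption: addr_a and addr_b are written by different processors, so
    sharing the line causes coherence traffic although no data is shared.
    """
    return addr_a != addr_b and cache_line(addr_a, line_size) == cache_line(addr_b, line_size)


def count_false_sharing(pairs, line_size: int) -> int:
    """Count the address pairs that collide on one line for this line size."""
    return sum(falsely_shared(a, b, line_size) for a, b in pairs)


# Hypothetical byte-address pairs, one address per processor.
pairs = [(0, 8), (16, 40), (64, 100), (128, 256)]

for line_size in (16, 32, 64, 128):
    print(line_size, count_false_sharing(pairs, line_size))
```

With these made-up addresses, more pairs collide on a single line as the line grows, which is the tension with spatial locality that a line-size study has to quantify.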



Editor information

Arndt Bode, Mike Reeve, Gottfried Wolf


Copyright information

© 1993 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bennett, A.J., Kelly, P.H.J. (1993). Locality and false sharing in coherent-cache parallel graph reduction. In: Bode, A., Reeve, M., Wolf, G. (eds) PARLE '93 Parallel Architectures and Languages Europe. PARLE 1993. Lecture Notes in Computer Science, vol 694. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56891-3_26


  • DOI: https://doi.org/10.1007/3-540-56891-3_26


  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-56891-9

  • Online ISBN: 978-3-540-47779-2

  • eBook Packages: Springer Book Archive
