Abstract
Parallel graph reduction is a model for parallel program execution in which shared memory is used under a strict access regime with single assignment and blocking reads. We outline the design of an efficient and accurate multiprocessor simulation scheme and present the results of a simulation study of a suite of benchmark programs running under a cache coherency protocol representative of those used in commercial shared-memory machines and in more scalable distributed shared-memory systems. We analyse the influence of cache line size on performance and expose the relative contributions of spatial, temporal and processor locality, and of false sharing, to overall performance.
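To make the notion of false sharing concrete, the following is a minimal sketch, not the simulator described in the paper. It assumes a hypothetical trace of (processor, address) pairs and a simplified definition: a cache line is falsely shared when more than one processor touches it but no single address within it is touched by more than one processor. The function name `classify_sharing` and the trace format are illustrative assumptions, and the sketch ignores the read/write distinction and the coherency protocol itself.

```python
from collections import defaultdict


def classify_sharing(accesses, line_size):
    """Split shared cache lines into truly and falsely shared sets.

    accesses: iterable of (processor_id, address) pairs (hypothetical format).
    line_size: number of addressable units per cache line.
    Returns (truly_shared, falsely_shared) as sets of line numbers.
    """
    # Record which processors touched each individual address.
    procs_by_addr = defaultdict(set)
    for proc, addr in accesses:
        procs_by_addr[addr].add(proc)

    procs_by_line = defaultdict(set)
    truly_shared = set()
    for addr, procs in procs_by_addr.items():
        line = addr // line_size
        procs_by_line[line] |= procs
        # An address used by several processors is genuine sharing.
        if len(procs) > 1:
            truly_shared.add(line)

    # Lines with several processors but no commonly used address.
    falsely_shared = {
        line
        for line, procs in procs_by_line.items()
        if len(procs) > 1 and line not in truly_shared
    }
    return truly_shared, falsely_shared


# Hypothetical trace: processors 0 and 1 use neighbouring addresses 0 and 8,
# and both use address 64.
trace = [(0, 0), (1, 8), (0, 64), (1, 64)]
print(classify_sharing(trace, 16))
print(classify_sharing(trace, 4))
```

With a line size of 16, addresses 0 and 8 fall in the same line and that line counts as falsely shared. With a line size of 4 they fall in different lines and only the genuinely shared address remains. This is the kind of line-size dependence the abstract refers to, under the simplifying assumptions stated above.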
Copyright information
© 1993 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bennett, A.J., Kelly, P.H.J. (1993). Locality and false sharing in coherent-cache parallel graph reduction. In: Bode, A., Reeve, M., Wolf, G. (eds) PARLE '93 Parallel Architectures and Languages Europe. PARLE 1993. Lecture Notes in Computer Science, vol 694. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-56891-3_26
DOI: https://doi.org/10.1007/3-540-56891-3_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-56891-9
Online ISBN: 978-3-540-47779-2
eBook Packages: Springer Book Archive