Abstract
Parallel functional programs based on the graph reduction execution model display considerable locality of reference, favouring the use of large cache lines in the implementation of the shared heap on a shared-memory multiprocessor. They also display a very high rate of synchronisation, making conventional weakly-consistent coherency protocols ineffective at avoiding unnecessary contention for write access to cache lines due to false sharing. We present the design of a specially adapted cache coherency protocol and show results of simulation experiments which demonstrate that the protocol allows spatial locality to be exploited to at least the level of a conventional invalidation protocol, but without the unnecessary serialisation and network transactions caused by false sharing.
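The false-sharing effect described above can be illustrated with a toy model. The sketch below is not the protocol presented in the paper: it assumes a simplified write-invalidate scheme in which each cache line has at most one writer at a time, and the function name `count_invalidations`, the trace format, and the line sizes are all hypothetical choices made for illustration. It counts how often a write forces another processor's copy of a line to be invalidated, even though the two processors never touch the same word.

```python
def count_invalidations(writes, line_words):
    """Count invalidations under a toy write-invalidate protocol.

    writes: sequence of (processor_id, word_address) pairs, in program order.
    line_words: number of words per cache line (an illustrative parameter).
    """
    owner = {}           # cache line index -> processor currently holding it writable
    invalidations = 0
    for proc, addr in writes:
        line = addr // line_words
        holder = owner.get(line)
        if holder is not None and holder != proc:
            # Another processor holds the line, so its copy must be invalidated,
            # even if it was using a different word in that line.
            invalidations += 1
        owner[line] = proc
    return invalidations


# Processor 0 writes word 0 and processor 1 writes word 1, alternating ten times each.
trace = [(p, p) for _ in range(10) for p in (0, 1)]

small = count_invalidations(trace, line_words=1)   # each word in its own line
large = count_invalidations(trace, line_words=4)   # both words share one line
```

In this model the one-word lines incur no invalidations, while the four-word line is invalidated on every write after the first. That is the unnecessary serialisation the abstract attributes to false sharing when large lines are used to exploit spatial locality.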
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bennett, A.J., Kelly, P.H.J. (1994). Eliminating invalidation in coherent-cache parallel graph reduction. In: Halatsis, C., Maritsas, D., Philokyprou, G., Theodoridis, S. (eds) PARLE'94 Parallel Architectures and Languages Europe. PARLE 1994. Lecture Notes in Computer Science, vol 817. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-58184-7_116
DOI: https://doi.org/10.1007/3-540-58184-7_116
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58184-0
Online ISBN: 978-3-540-48477-6
eBook Packages: Springer Book Archive