Abstract
As multiprocessors are scaled beyond single bus systems, there is renewed interest in directory-based cache coherence schemes. These schemes rely on a directory to keep track of all processors caching a memory block. When a write to that block occurs, point-to-point invalidation messages are sent to keep the caches coherent. A straightforward way of recording the identities of processors caching a memory block is to use a bit vector per memory block, with one bit per processor. Unfortunately, when the main memory grows linearly with the number of processors, the total size of the directory memory grows as the square of the number of processors, which is prohibitive for large machines. To remedy this problem several schemes that use a limited number of pointers per directory entry have been suggested. These schemes often cause excessive invalidation traffic.
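The quadratic growth described above can be checked with a small calculation. The following is a minimal sketch, not taken from the paper: the function names and the figure of 1024 blocks per processor are assumptions chosen for illustration, and the limited-pointer layout (a fixed number of pointers, each wide enough to name any processor) is one common reading of such schemes.

```python
def full_bit_vector_bits(num_procs, blocks_per_proc):
    """Directory bits when every memory block has one presence bit per processor."""
    # Main memory grows linearly with the machine: blocks_per_proc per processor.
    total_blocks = num_procs * blocks_per_proc
    return total_blocks * num_procs


def limited_pointer_bits(num_procs, blocks_per_proc, num_pointers):
    """Directory bits when every block keeps a fixed number of processor pointers."""
    total_blocks = num_procs * blocks_per_proc
    # Bits needed to name any one processor (assumed encoding: plain binary id).
    pointer_width = max(1, (num_procs - 1).bit_length())
    return total_blocks * num_pointers * pointer_width


if __name__ == "__main__":
    for procs in (64, 128, 256):
        full = full_bit_vector_bits(procs, 1024)
        limited = limited_pointer_bits(procs, 1024, 4)
        print(procs, full, limited)
```

Under these assumptions, doubling the processor count quadruples the full bit vector directory, while the limited-pointer directory grows only slightly faster than linearly.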
In this paper, we propose two simple techniques that significantly reduce invalidation traffic and directory memory requirements. First, we present the coarse vector as a novel way of keeping directory state information. This scheme uses as little memory as other limited pointer schemes, but causes significantly less invalidation traffic. Second, we propose sparse directories, where one directory entry is associated with several memory blocks, as a technique for greatly reducing directory memory requirements. The paper presents an evaluation of the proposed techniques in the context of the Stanford DASH multiprocessor architecture. Results indicate that sparse directories coupled with coarse vectors can save one to two orders of magnitude in storage, with only a slight degradation in performance.
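The abstract does not spell out how a coarse vector entry is encoded. As a rough illustration only, the sketch below assumes an entry that holds a few exact pointers and, once they overflow, is reinterpreted as a vector with one bit per fixed-size group of processors. The class name `CoarseVectorEntry`, its parameters, and the group size are hypothetical and are not claimed to match the paper's design.

```python
class CoarseVectorEntry:
    """Hypothetical directory entry: exact pointers first, coarse vector on overflow."""

    def __init__(self, num_procs, num_pointers, region_size):
        self.num_procs = num_procs
        self.num_pointers = num_pointers
        self.region_size = region_size
        self.pointers = set()    # exact sharer ids while they still fit
        self.coarse_bits = None  # region indices, used only after overflow

    def add_sharer(self, proc):
        if self.coarse_bits is not None:
            # Already coarse: record only the region the processor belongs to.
            self.coarse_bits.add(proc // self.region_size)
            return
        self.pointers.add(proc)
        if len(self.pointers) > self.num_pointers:
            # Overflow: keep one bit per region instead of exact pointers.
            self.coarse_bits = {p // self.region_size for p in self.pointers}
            self.pointers = set()

    def invalidation_targets(self, writer):
        """Processors that must receive an invalidation when `writer` writes."""
        if self.coarse_bits is None:
            targets = set(self.pointers)
        else:
            targets = set()
            for region in self.coarse_bits:
                start = region * self.region_size
                end = min(start + self.region_size, self.num_procs)
                targets.update(range(start, end))
        targets.discard(writer)
        return sorted(targets)


if __name__ == "__main__":
    entry = CoarseVectorEntry(num_procs=32, num_pointers=3, region_size=4)
    for p in (1, 2, 9, 17):
        entry.add_sharer(p)
    print(entry.invalidation_targets(writer=2))
```

In this sketch, four sharers on a 32-processor machine overflow the three pointers, and a write then invalidates 11 processors in three regions rather than broadcasting to all 31 others, which is the kind of traffic reduction the abstract attributes to the coarse vector.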
This paper also appeared in the Proceedings of the International Conference on Parallel Processing, August 1990.
Copyright information
© 1992 Springer Science+Business Media New York
About this chapter
Cite this chapter
Gupta, A., Weber, WD., Mowry, T. (1992). Reducing Memory and Traffic Requirements for Scalable Directory-Based Cache Coherence Schemes*. In: Dubois, M., Thakkar, S. (eds) Scalable Shared Memory Multiprocessors. Springer, Boston, MA. https://doi.org/10.1007/978-1-4615-3604-8_9
DOI: https://doi.org/10.1007/978-1-4615-3604-8_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4613-6601-0
Online ISBN: 978-1-4615-3604-8
eBook Packages: Springer Book Archive