Using Partial Tag Comparison in Low-Power Snoop-Based Chip Multiprocessors

  • Conference paper
Computer Architecture (ISCA 2010)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6161)

Abstract

In this work we introduce power optimizations that rely on partial tag comparison (PTC) in snoop-based chip multiprocessors. They build on the observation that detecting a tag mismatch in a snoop-based chip multiprocessor does not require processing the entire tag: a high percentage of cache mismatches can be detected using a small but highly informative subset of the tag bits.
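
The observation can be illustrated with a minimal sketch. The abstract does not say which tag bits are used or how many, so the choice of the low-order bits and the helper names `partial_tag` and `definitely_mismatch` are assumptions made purely for illustration.

```python
def partial_tag(tag: int, bits: int) -> int:
    """Keep only the `bits` low-order bits of a tag (assumed bit choice)."""
    return tag & ((1 << bits) - 1)


def definitely_mismatch(tag_a: int, tag_b: int, bits: int) -> bool:
    """True when the partial tags differ, which proves the full tags differ.

    False means "cannot tell": the partial tags agree, so a full
    comparison is still needed to decide hit or miss.
    """
    return partial_tag(tag_a, bits) != partial_tag(tag_b, bits)


# Tags differ in a low-order bit: 4 bits are enough to prove the mismatch.
print(definitely_mismatch(0b1011_0110, 0b1011_0111, bits=4))  # True

# Tags differ only in high-order bits: the partial check is inconclusive.
print(definitely_mismatch(0b1011_0110, 0b0001_0110, bits=4))  # False
```

A partial comparison can therefore rule out a hit but never confirm one, which is why it suits mismatch detection rather than hit detection.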

Based on this observation, we introduce a source-based snoop filtering mechanism referred to as S-PTC. In S-PTC, possible remote tag mismatches are detected before the request is sent. S-PTC reduces power by preventing unnecessary snoops from being sent and by avoiding unnecessary tag lookups at the end-points. It also improves performance as a result of early cache miss detection.
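
A source-side filter of this kind might be sketched as follows. This is not the paper's design, which the abstract does not detail: the cache geometry constants, the `SourceFilter` class, and the way the requester learns remote partial tags (an explicit `record_fill` call) are all hypothetical, and evictions are ignored for brevity.

```python
BLOCK_BITS = 6    # assumed 64-byte cache blocks
INDEX_BITS = 7    # assumed 128 sets per cache
PARTIAL_BITS = 4  # assumed number of low-order tag bits kept at the source


def split(addr: int) -> tuple[int, int]:
    """Split an address into (set index, full tag)."""
    index = (addr >> BLOCK_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (BLOCK_BITS + INDEX_BITS)
    return index, tag


def partial(tag: int) -> int:
    """Low-order PARTIAL_BITS bits of a tag."""
    return tag & ((1 << PARTIAL_BITS) - 1)


class SourceFilter:
    """Per-requester record of partial tags believed to be in remote caches."""

    def __init__(self, remote_nodes):
        # node -> set index -> set of partial tags seen in that set
        self.table = {node: {} for node in remote_nodes}

    def record_fill(self, node: int, addr: int) -> None:
        """Note that `node` now caches the block containing `addr`."""
        index, tag = split(addr)
        self.table[node].setdefault(index, set()).add(partial(tag))

    def nodes_to_snoop(self, addr: int) -> list[int]:
        """Nodes whose partial tags cannot rule out a hit for `addr`.

        Entries are never removed in this sketch, so the filter can only
        err towards sending an unneeded snoop, never towards skipping a
        needed one.
        """
        index, tag = split(addr)
        p = partial(tag)
        return [n for n, sets in self.table.items() if p in sets.get(index, set())]


f = SourceFilter(remote_nodes=[1, 2])
f.record_fill(1, 0x12345040)
print(f.nodes_to_snoop(0x12345040))              # [1]: node 1 may hold the block
print(f.nodes_to_snoop(0x12345040 + (1 << 13)))  # []: partial tags differ, no snoop sent
```

When the returned list is empty, the requester knows every remote cache will miss before any snoop is sent, which corresponds to the early miss detection described above.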

Across the configurations studied and the SPLASH-2 benchmarks used, S-PTC improves average performance by 2.9% to 3.5%. Our solutions reduce snoop request bandwidth by 78.5% to 81.9% and average tag array dynamic power by about 52%.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Shafiee, A., Shahidi, N., Baniasadi, A. (2011). Using Partial Tag Comparison in Low-Power Snoop-Based Chip Multiprocessors. In: Varbanescu, A.L., Molnos, A., van Nieuwpoort, R. (eds) Computer Architecture. ISCA 2010. Lecture Notes in Computer Science, vol 6161. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24322-6_18

  • DOI: https://doi.org/10.1007/978-3-642-24322-6_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24321-9

  • Online ISBN: 978-3-642-24322-6

  • eBook Packages: Computer Science, Computer Science (R0)
