Abstract
Chip multiprocessors (CMPs) are becoming the mainstream computing platform, and designing an efficient on-chip memory hierarchy is one of the key challenges in computer architecture. Tiled architectures and non-uniform cache architectures (NUCA) are commonly adopted in modern CMPs. Previous work on cache replacement policies usually assumes a unified last-level cache or multiprogrammed workloads; few studies address the replacement policy of a cache clustering scheme running parallel workloads. Cache clustering, which is a tradeoff between the shared cache organization and the private cache organization that adopts cache replication, can improve system performance on parallel workloads. Under a cache clustering scheme, the cache blocks in the last-level cache can be subdivided into eight types.
In this work we propose the Data access Type Aware Replacement Policy (DTARP) for the cache clustering organization. DTARP classifies data blocks in the last-level cache into different access types and, building on the traditional LRU policy, designs its insertion and victim selection policies according to those types. As a result, globally shared data is kept in the last-level cache longer than under the baseline LRU policy. Simulation results show that DTARP improves the system performance of a cluster scheme using the LRU policy by 10.9% on average.
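The general idea of a type-aware variant of LRU can be illustrated with a minimal sketch. This is not the DTARP algorithm itself: the abstract does not list the eight block types or the exact insertion and victim selection rules, so the two labels in `AccessType`, the class `TypeAwareLRUSet`, and both rules below are assumptions made purely for illustration.

```python
from enum import Enum


class AccessType(Enum):
    # Hypothetical labels; the abstract mentions eight types but does not list them.
    GLOBAL_SHARED = "global_shared"
    OTHER = "other"


class TypeAwareLRUSet:
    """One cache set whose recency stack runs from MRU (index 0) to LRU (last)."""

    def __init__(self, ways):
        self.ways = ways
        self.stack = []  # list of (tag, AccessType), MRU first

    def _find(self, tag):
        for i, (t, _) in enumerate(self.stack):
            if t == tag:
                return i
        return None

    def _pick_victim(self):
        # Assumed victim rule: scan from the LRU end and evict the first block
        # that is not globally shared; fall back to plain LRU if all are shared.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][1] is not AccessType.GLOBAL_SHARED:
                return i
        return len(self.stack) - 1

    def access(self, tag, access_type):
        """Return True on a hit, False on a miss (the block is then inserted)."""
        i = self._find(tag)
        if i is not None:
            # Hit: promote to MRU, as in ordinary LRU.
            self.stack.pop(i)
            self.stack.insert(0, (tag, access_type))
            return True
        if len(self.stack) >= self.ways:
            self.stack.pop(self._pick_victim())
        # Assumed insertion rule: globally shared blocks enter at MRU,
        # all other blocks enter in the middle of the stack.
        pos = 0 if access_type is AccessType.GLOBAL_SHARED else len(self.stack) // 2
        self.stack.insert(pos, (tag, access_type))
        return False
```

In a two-way set, accessing a globally shared block `g` and then two other blocks `a` and `b` would evict `g` under plain LRU. With the assumed rules above, `a` is evicted instead and a later access to `g` still hits, which is the kind of longer residency for globally shared data that the abstract describes.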
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Li, C., Wang, D., Wang, H., Li, G., Xue, Y. (2013). Data Access Type Aware Replacement Policy for Cache Clustering Organization of Chip Multiprocessors. In: Wu, C., Cohen, A. (eds) Advanced Parallel Processing Technologies. APPT 2013. Lecture Notes in Computer Science, vol 8299. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45293-2_19
DOI: https://doi.org/10.1007/978-3-642-45293-2_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45292-5
Online ISBN: 978-3-642-45293-2
eBook Packages: Computer Science, Computer Science (R0)