Last Bank: Dealing with Address Reuse in Non-Uniform Cache Architecture for CMPs

Lira, Javier; Molina, Carlos; González, Antonio

doi:10.1007/978-3-642-03869-3_30

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5704)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

In response to the constant increase in wire delays, Non-Uniform Cache Architecture (NUCA) has been introduced as an effective memory model for dealing with growing memory latencies. This architecture divides a large memory cache into smaller banks that can be accessed independently. Banks close to the cache controller therefore have a faster response time than banks located farther away from it. In this paper, we propose and analyse the insertion of an additional bank into the NUCA cache. This is called Last Bank. This extra bank deals with data blocks that have been evicted from the other banks in the NUCA cache. Furthermore, we analyse the behaviour of the cache line replacements done in the NUCA cache and propose two optimisations of Last Bank that provide significant performance benefits without incurring unaffordable implementation costs.
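The role of the extra bank can be illustrated with a toy model. The following is a minimal sketch and not the authors' implementation: the abstract does not specify the mapping function, the replacement policy, or the bank sizes, so the modulo mapping, the LRU policy, the class names (`LRUBank`, `ToyNUCA`), and the capacities used here are all assumptions made for illustration. Access latencies and the two proposed optimisations are not modelled.

```python
from collections import OrderedDict


class LRUBank:
    """Toy fully associative bank with LRU replacement (assumed policy)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # address -> True, least recent first

    def lookup(self, addr):
        """Return True on a hit and mark the block as most recently used."""
        if addr in self.blocks:
            self.blocks.move_to_end(addr)
            return True
        return False

    def remove(self, addr):
        self.blocks.pop(addr, None)

    def insert(self, addr):
        """Insert addr; return the evicted address, or None if nothing was evicted."""
        self.blocks[addr] = True
        self.blocks.move_to_end(addr)
        if len(self.blocks) > self.capacity:
            victim, _ = self.blocks.popitem(last=False)
            return victim
        return None


class ToyNUCA:
    """Regular banks plus one extra bank that receives their evicted blocks."""

    def __init__(self, num_banks, bank_capacity, last_bank_capacity):
        self.banks = [LRUBank(bank_capacity) for _ in range(num_banks)]
        self.last_bank = LRUBank(last_bank_capacity)
        self.stats = {"bank_hit": 0, "last_bank_hit": 0, "miss": 0}

    def _home(self, addr):
        # Hypothetical static mapping: block address modulo number of banks.
        return self.banks[addr % len(self.banks)]

    def access(self, addr):
        bank = self._home(addr)
        if bank.lookup(addr):
            outcome = "bank_hit"
        else:
            if self.last_bank.lookup(addr):
                # The block was evicted earlier and is being reused:
                # bring it back from the extra bank instead of going off-chip.
                self.last_bank.remove(addr)
                outcome = "last_bank_hit"
            else:
                outcome = "miss"
            victim = bank.insert(addr)
            if victim is not None:
                # Evicted blocks go to the extra bank; anything that
                # overflows the extra bank simply leaves the cache.
                self.last_bank.insert(victim)
        self.stats[outcome] += 1
        return outcome


cache = ToyNUCA(num_banks=2, bank_capacity=1, last_bank_capacity=2)
outcomes = [cache.access(addr) for addr in [0, 2, 0, 0]]
print(outcomes)
```

In this trace, addresses 0 and 2 conflict in the same bank, so the second access evicts block 0 into the extra bank and the third access recovers it from there rather than missing. This is the address-reuse case the title refers to, under the simplifying assumptions stated above.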

Author information

Authors and Affiliations

Universitat Politècnica de Catalunya, Spain
Javier Lira
Universitat Rovira i Virgili, Spain
Carlos Molina
Intel Barcelona Research Center, Intel Labs, UPC, Spain
Antonio González

Editor information

Editors and Affiliations

Department of Software Technology, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands
Henk Sips, Dick Epema & Hai-Xiang Lin

About this paper

Cite this paper

Lira, J., Molina, C., González, A. (2009). Last Bank: Dealing with Address Reuse in Non-Uniform Cache Architecture for CMPs. In: Sips, H., Epema, D., Lin, HX. (eds) Euro-Par 2009 Parallel Processing. Euro-Par 2009. Lecture Notes in Computer Science, vol 5704. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03869-3_30

DOI: https://doi.org/10.1007/978-3-642-03869-3_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03868-6
Online ISBN: 978-3-642-03869-3
eBook Packages: Computer Science, Computer Science (R0)
