Approximate Cache Architectures

Jerger, Natalie Enright; Miguel, Joshua San

doi:10.1007/978-3-319-99322-5_20

Natalie Enright Jerger & Joshua San Miguel

Abstract

In this chapter, we explore the application of approximate computing techniques to caches and the memory access portion of the processor pipeline. Because memory accesses contribute significantly to the latency and energy consumption of applications, they have long been the target of various optimizations. Large cache hierarchies are a mainstay of modern designs, since they avoid the long latency and high energy of accessing DRAM on every load or store request. With growing data set sizes, however, building ever larger caches is not necessarily an effective use of silicon real estate. We present recent work that improves the effectiveness of cache storage and reduces the cost of memory accesses by exploiting the inherently noisy or imprecise data that many applications operate on. First, we consider work that selectively forgoes loading data from the caches and memory when the processor can make a reasonable estimate of the value that is needed. Next, we explore work that selectively determines which values to store in the cache through approximate deduplication of data; reducing how much data needs to be stored increases the effective cache capacity.
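
The two ideas named in the abstract can be illustrated with a toy Python sketch. It is not the chapter's design: the class names, the mean-of-recent-values estimator, and the quantization `step` are all assumptions chosen only to make the concepts concrete. The first class estimates a load's value from recent values seen by the same instruction instead of fetching it; the second lets blocks whose values are close enough share a single data entry.

```python
from collections import deque


class LoadValueApproximator:
    """Hypothetical estimator: average of recent values per load instruction."""

    def __init__(self, history=4):
        self.history = history
        self.table = {}  # load PC -> recent values observed for that load

    def train(self, pc, value):
        # Record a precisely fetched value for the load at this PC.
        self.table.setdefault(pc, deque(maxlen=self.history)).append(value)

    def estimate(self, pc):
        # Return an approximate value, or None if there is no history
        # (in which case the load would have to be serviced precisely).
        hist = self.table.get(pc)
        if not hist:
            return None
        return sum(hist) / len(hist)


class ApproxDedupCache:
    """Hypothetical cache in which similar blocks share one data entry."""

    def __init__(self, step=0.5):
        self.step = step  # assumed similarity granularity
        self.tags = {}    # address -> signature of its block
        self.data = {}    # signature -> the single stored block

    def _signature(self, block):
        # Blocks whose values quantize to the same buckets count as duplicates.
        return tuple(round(v / self.step) for v in block)

    def store(self, addr, block):
        sig = self._signature(block)
        self.tags[addr] = sig
        # Keep only the first block seen for each signature.
        self.data.setdefault(sig, list(block))

    def load(self, addr):
        # May return a similar block rather than the exact one stored.
        return self.data[self.tags[addr]]
```

In this sketch, storing `[1.0, 2.0]` and `[1.1, 2.1]` at different addresses occupies one data entry, so a later load of the second address returns the first block's values. That is the accuracy-for-capacity trade the abstract describes, shown here in its simplest form.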

Author information

Authors and Affiliations

University of Toronto, Toronto, ON, Canada
Natalie Enright Jerger
University of Wisconsin-Madison, Madison, WI, USA
Joshua San Miguel

Corresponding author

Correspondence to Natalie Enright Jerger.

Editor information

Editors and Affiliations

Brown University, Providence, Rhode Island, USA
Sherief Reda
Vienna University of Technology, Vienna, Austria
Muhammad Shafique

About this chapter

Cite this chapter

Jerger, N.E., Miguel, J.S. (2019). Approximate Cache Architectures. In: Reda, S., Shafique, M. (eds) Approximate Circuits. Springer, Cham. https://doi.org/10.1007/978-3-319-99322-5_20

DOI: https://doi.org/10.1007/978-3-319-99322-5_20
Published: 06 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99321-8
Online ISBN: 978-3-319-99322-5
eBook Packages: Engineering, Engineering (R0)
