Abstract
In nanoscale technologies, memories are vulnerable both to parametric failures and to runtime failures induced by “soft errors” such as voltage noise, thermal noise, and aging effects. This chapter addresses runtime failures in on-chip memories induced by such soft errors. Conventionally, these runtime errors are handled with single error correction double error detection (SECDED) codes. However, SECDED codes can correct only a single bit error per protected word, which makes them inadequate for protecting memory in scaled technologies (sub-45 nm), which are particularly vulnerable to multiple-bit failures (Hareland et al. Characterization of Multi-bit Soft Error events in advanced SRAMs, Intl. Electron Devices Meeting, 2003; Osada et al. IEEE J Solid State Circ 39(5), 2004). The need to tolerate multi-bit failures is accentuated by inter-die and intra-die variation in memory blocks, which increases their vulnerability to runtime failures. This chapter explores architectural modifications and a novel reconfigurable error-control coding (ECC) technique that together can be highly effective in protecting on-chip memories against runtime failures.
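The limited correction capability of SECDED mentioned above can be illustrated with a small sketch. It assumes an extended Hamming (8,4) code as a stand-in; on-chip memories typically use wider SECDED codes (for example 72 bits protecting 64 data bits), and the function names `encode` and `decode` are hypothetical, not taken from the chapter. The sketch shows one flipped bit being corrected, two being detected, and three being silently miscorrected.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0: overall parity; indices 1..7: Hamming(7,4) positions
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]   # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]   # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2             # makes the parity of the whole word even
    return code


def decode(code):
    """Return (status, nibble) with status 'ok', 'corrected', or 'detected'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of the positions holding a 1
    odd_parity = sum(code) % 2 == 1
    if odd_parity:
        # Odd number of flips: the decoder assumes exactly one and repairs it.
        # A syndrome of 0 here points at the overall parity bit itself.
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome != 0:
        # Even parity but a non-zero syndrome: two flips, not correctable.
        return "detected", None
    else:
        status = "ok"
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return status, nibble


def flip(code, positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(code)
    for p in positions:
        out[p] ^= 1
    return out


word = encode(0b1011)
print(decode(flip(word, [5])))        # single-bit error: corrected
print(decode(flip(word, [5, 6])))     # double-bit error: detected only
print(decode(flip(word, [3, 5, 6])))  # triple-bit error: reported as corrected, data wrong
```

The last case is the one that matters for multi-bit failures: the decoder reports a successful correction while returning corrupted data, which is why a stronger or reconfigurable ECC is needed when multi-bit upsets become likely.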
References
J. Maiz, S. Hareland, K. Zhang, P. Armstrong, “Characterization of Multi-bit Soft Error events in advanced SRAMs”, in Intl. Electron Devices Meeting, 2003
C.W. Slayman, “Cache and memory error detection, correction, and reduction techniques for terrestrial servers and workstations”. IEEE Trans. Device Mater. Reliab. 5(3), 397–404 (2005)
K. Osada, K. Yamaguchi, Y. Saitoh, “SRAM immunity to cosmic-ray-induced multierrors based on analysis of an induced parasitic bipolar effect”. IEEE J. Solid State Circ. 39(5), 827–833 (2004)
D.M. Kwai et al., “Detection of SRAM Cell Stability by Lowering Array Supply Voltage”, in Asian Test Symposium, 2000
A. Pavlov et al., “Weak cell detection in deep-submicron SRAMs: A programmable detection technique”. IEEE J. Solid State Circ. 41(10), 2334–2343 (2006)
S. Mukhopadhyay, K. Kim, H. Mahmoodi, K. Roy, “Design of a process variation tolerant self-repairing SRAM for yield enhancement in nanoscaled CMOS”. IEEE J. Solid State Circ. 42(6), 1370–1382 (2007)
S. Paul, F. Cai, X. Zhang, S. Bhunia, “Reliability-driven ECC allocation for multiple bit error resilience in processor cache”. IEEE Trans. Comput. 60(1), 20–34 (2011)
S. Rusu, H. Muljono, B. Cherkauer, “Itanium 2 Processor 6M: Higher Frequency and Larger L3 Cache”, in Intl. Symposium on Microarchitecture, 2004
J.L. Shin, B. Petrick, M. Singh, A.S. Leon, “Design and implementation of an embedded 512-KB Level-2 cache subsystem”. IEEE J. Solid State Circ. 40(9), 349–352 (2005)
J. Kim, N. Hardavellas, K. Mai, B. Falsafi, J.C. Hoe, “Multi-bit Error Tolerant Caches Using Two-Dimensional Error Coding”, in Intl. Symposium on Microarchitecture, 2007
Z. Chishti, A.R. Alameldeen, C. Wilkerson, W. Wu, S. Lu, “Improving Cache Lifetime Reliability at Ultra-low Voltages”, in Intl. Symposium on Microarchitecture, 2009
D. Brooks, V. Tiwari, M. Martonosi, “Wattch: A Framework for Architectural-Level Power Analysis and Optimizations”, in Intl. Symposium on Computer Architecture, 2000
Copyright information
© 2014 Springer Science+Business Media New York
About this chapter
Cite this chapter
Paul, S., Bhunia, S. (2014). Mitigating the Effect of Runtime-Failures in MBC Frameworks. In: Computing with Memory for Energy-Efficient Robust Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7798-3_19
DOI: https://doi.org/10.1007/978-1-4614-7798-3_19
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7797-6
Online ISBN: 978-1-4614-7798-3
eBook Packages: Engineering, Engineering (R0)