Mitigating the Effect of Runtime-Failures in MBC Frameworks

Paul, Somnath; Bhunia, Swarup

doi:10.1007/978-1-4614-7798-3_19

Abstract

In nanoscale technologies, memories are vulnerable both to parametric failures and to runtime failures induced by “soft errors” such as voltage or thermal noise and aging effects. This chapter addresses the runtime failures that such soft errors induce in on-chip memories. Conventionally, runtime errors of this kind are handled with single-error-correction, double-error-detection (SECDED) codes. However, these codes can correct only one erroneous bit per codeword, which makes them inefficient for protecting memory in scaled technologies (sub-45 nm), where multiple-bit failures are a particular concern (Maiz et al. Characterization of Multi-bit Soft Error events in advanced SRAMs, Intl. Electron Devices Meeting, 2003; Osada et al. IEEE J Solid State Circ 39(5), 2004). The need to tolerate multi-bit failures is accentuated by inter-die and intra-die variation in memory blocks, which increases their vulnerability to runtime failures. This chapter explores architectural modifications and a novel reconfigurable error-control coding (ECC) technique which together can be extremely effective in protecting on-chip memories against runtime failures.
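The limitation of SECDED noted in the abstract can be illustrated with a small sketch. The code below is not the chapter's reconfigurable ECC scheme; it is a textbook extended Hamming (8,4) code, chosen here only as an assumed, minimal instance of SECDED. The function names `encode`, `decode`, and `flip` are hypothetical helpers introduced for this example.

```python
# Minimal SECDED sketch: extended Hamming (8,4).
# Assumption: this toy code stands in for the wider SECDED codes used in
# real caches; it is not the scheme proposed in the chapter.

def encode(data4):
    """Encode 4 data bits into an 8-bit codeword.

    Index 0 holds the overall parity bit; indices 1..7 form Hamming(7,4)
    with parity bits at 1, 2, 4 and data bits at 3, 5, 6, 7.
    """
    code = [0] * 8
    code[3], code[5], code[6], code[7] = data4
    for p in (1, 2, 4):
        # Parity bit p covers every position whose index has bit p set.
        for i in range(1, 8):
            if i & p and i != p:
                code[p] ^= code[i]
    code[0] = sum(code[1:]) % 2  # makes the total number of ones even
    return code

def decode(word):
    """Return (status, data); data is None when the error is uncorrectable."""
    w = list(word)
    syndrome = 0
    for i in range(1, 8):
        if w[i]:
            syndrome ^= i          # XOR of the indices of all set bits
    overall = sum(w) % 2           # 1 means an odd number of bits flipped
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Treated as a single-bit error; syndrome 0 points at the parity bit.
        w[syndrome] ^= 1
        status = "corrected"
    else:
        return "double_error_detected", None
    return status, [w[3], w[5], w[6], w[7]]

def flip(word, positions):
    """Return a copy of word with the given bit positions inverted."""
    out = list(word)
    for pos in positions:
        out[pos] ^= 1
    return out

data = [1, 0, 1, 1]
cw = encode(data)
print(decode(flip(cw, [5])))        # one flipped bit is corrected
print(decode(flip(cw, [2, 6])))     # two flipped bits are only detected
print(decode(flip(cw, [3, 5, 7])))  # three flipped bits are miscorrected
```

With three flipped bits the decoder reports a successful correction yet returns data that differs from the original word. This silent miscorrection is the kind of multi-bit failure that motivates stronger protection than SECDED alone.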

References

J. Maiz, S. Hareland, K. Zhang, P. Armstrong, “Characterization of Multi-bit Soft Error events in advanced SRAMs”, in Intl. Electron Devices Meeting, 2003
C.W. Slayman, “Cache and memory error detection, correction, and reduction techniques for terrestrial servers and workstations”. IEEE Trans. Device Mater. Reliab. 5(3), 397–404 (2005)
K. Osada, K. Yamaguchi, Y. Saitoh, “SRAM immunity to cosmic-ray-induced multierrors based on analysis of an induced parasitic bipolar effect”. IEEE J. Solid State Circ. 39(5), 827–833 (2004)
D.M. Kwai et al., “Detection of SRAM Cell Stability by Lowering Array Supply Voltage”, in Asian Test Symposium, 2000
A. Pavlov et al., “Weak cell detection in deep-submicron SRAMs: A programmable detection technique”. IEEE J. Solid State Circ. 41(10), 2334–2343 (2006)
S. Mukhopadhyay, K. Kim, H. Mahmoodi, K. Roy, “Design of a process variation tolerant self-repairing SRAM for yield enhancement in nanoscaled CMOS”. IEEE J. Solid State Circ. 42(6), 1370–1382 (2007)
S. Paul, F. Cai, X. Zhang, S. Bhunia, “Reliability-driven ECC allocation for multiple bit error resilience in processor cache”. IEEE Trans. Comput. 60(1), 20–34 (2011)
S. Rusu, H. Muljono, B. Cherkauer, “Itanium 2 Processor 6M: Higher Frequency and Larger L3 Cache”, in Intl. Symposium on Microarchitecture, 2004
J.L. Shin, B. Petrick, M. Singh, A.S. Leon, “Design and implementation of an embedded 512-KB Level-2 cache subsystem”. IEEE J. Solid State Circ. 40(9), 349–352 (2005)
J. Kim, N. Hardavellas, K. Mai, B. Falsafi, J.C. Hoe, “Multi-bit Error Tolerant Caches Using Two-Dimensional Error Coding”, in Intl. Symposium on Microarchitecture, 2007
Z. Chishti, A.R. Alameldeen, C. Wilkerson, W. Wu, S. Lu, “Improving Cache Lifetime Reliability at Ultra-low voltages”, in Intl. Symposium on Microarchitecture, 2009
D. Brooks, V. Tiwari, M. Martonosi, “Wattch: A Framework for Architectural-Level Power Analysis and Optimizations”, in Intl. Symposium on Computer Architecture, 2000

Author information

Authors and Affiliations

Intel Labs, Hillsboro, OR, USA
Somnath Paul
Department of EECS, Case Western Reserve University, Cleveland, OH, USA
Swarup Bhunia

About this chapter

Cite this chapter

Paul, S., Bhunia, S. (2014). Mitigating the Effect of Runtime-Failures in MBC Frameworks. In: Computing with Memory for Energy-Efficient Robust Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7798-3_19

DOI: https://doi.org/10.1007/978-1-4614-7798-3_19
Published: 08 August 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-7797-6
Online ISBN: 978-1-4614-7798-3
eBook Packages: Engineering, Engineering (R0)