Abstract
Exponential growth in the number of transistors per chip, along with increasing clock frequencies and operational voltages and decreasing load capacitance, is raising the likelihood of soft errors in embedded systems. Transistors on current chips have components separated by only a few hundred atoms; hence, a small voltage glitch can alter the state of a transistor and cause a soft error in the system. This will become an even greater concern as line widths shrink further. The complicated linkages among the components on a chip directly affect the reliability of embedded systems and make them sensitive to soft errors. The common approaches to addressing such errors focus on post-design phases and are complex and costly to implement. However, reliability, a vital non-functional attribute of a system, should be validated at the design phase, particularly for critical systems. This paper proposes an efficient approach to measuring and minimizing the potential threat of soft errors in embedded systems during the early phase of system-level design. The methodology is validated against a system model that must have high reliability.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
About this article
Cite this article
Sadi, M.S., Ahmed, W. & Jürjens, J. Towards Tolerating Soft Errors for Embedded Systems. SN COMPUT. SCI. 2, 101 (2021). https://doi.org/10.1007/s42979-021-00497-9
DOI: https://doi.org/10.1007/s42979-021-00497-9