Abstract
Recent studies suggest that future microprocessors need low-cost fault-tolerance solutions for reliable operation. Several competing software-implemented error-detection methods have been shown to increase the overall resiliency when applied to critical spots in the system. Fault injection (FI) is a common approach to assess a system’s vulnerability to hardware faults. In an FI campaign comprising multiple runs of an application benchmark, each run simulates the impact of a fault in a specific hardware location at a specific point in time. Unfortunately, exhaustive FI campaigns covering all possible fault locations are infeasible even for small target applications. Commonly used sampling techniques, while sufficient to measure overall resilience improvements, lack the level of detail and accuracy needed for the identification of critical spots, such as important variables or program phases. Many faults are sampled out, leaving the developer without any information on the application parts they would have targeted.
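To make the scale problem concrete, the following minimal sketch counts the coordinates of a fault space and draws a uniform random sample from it. It assumes a fault model of single-bit flips in main memory, which the abstract does not state; the function names and the workload figures in the usage note are hypothetical and chosen only for illustration.

```python
import random


def fault_space_size(cycles: int, memory_bytes: int) -> int:
    """Count (time, bit) coordinates, assuming single-bit flips in memory."""
    return cycles * memory_bytes * 8


def sample_fault_space(cycles: int, memory_bytes: int, n_samples: int, seed: int = 0):
    """Draw n_samples distinct (cycle, bit_index) coordinates uniformly at random."""
    rng = random.Random(seed)
    bits = memory_bytes * 8
    total = fault_space_size(cycles, memory_bytes)
    # Sample flat indices without replacement, then unflatten into coordinates.
    picks = rng.sample(range(total), n_samples)
    return [(p // bits, p % bits) for p in picks]
```

For a hypothetical run of 1,000,000 cycles using 4 KiB of memory, `fault_space_size(1_000_000, 4096)` yields 32,768,000,000 coordinates. A sample of 1,000 of them can estimate an overall failure rate, but it says nothing about the coordinates that were not drawn, which is the limitation the abstract points out.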
We present a methodology and tool implementation that reduces experimentation effort in an application-specific way, lets the user freely trade the number of FI runs for result accuracy, and provides information on all possible fault locations. After a set of Pareto-optimal heuristics has been trained, the user can specify a maximum number of FI experiments. A detailed evaluation with a set of benchmarks running on the eCos embedded OS, including MiBench’s automotive benchmark category, demonstrates the applicability and effectiveness of our approach: for example, when the user chooses to run only 1.5% of all FI experiments, the average result accuracy is still 99.84%.
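The trade-off between experiment count and accuracy can be illustrated with a generic Pareto filter. The sketch below is not the paper's training procedure: it only assumes that each trained heuristic can be summarized by the fraction of FI experiments it requires and the accuracy it achieves, and then picks the most accurate non-dominated heuristic that fits a user-given budget. The `Heuristic` type, both function names, and all numbers are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Heuristic(NamedTuple):
    name: str
    cost: float      # fraction of all FI experiments that must run (lower is better)
    accuracy: float  # fraction of results reproduced correctly (higher is better)


def pareto_front(cands: List[Heuristic]) -> List[Heuristic]:
    """Keep candidates that no other candidate beats on both cost and accuracy."""
    front = []
    for c in cands:
        dominated = any(
            o.cost <= c.cost and o.accuracy >= c.accuracy
            and (o.cost < c.cost or o.accuracy > c.accuracy)
            for o in cands
        )
        if not dominated:
            front.append(c)
    return sorted(front, key=lambda h: h.cost)


def pick_for_budget(front: List[Heuristic], max_cost: float) -> Optional[Heuristic]:
    """Return the most accurate heuristic within the budget, or None if none fits."""
    affordable = [h for h in front if h.cost <= max_cost]
    return max(affordable, key=lambda h: h.accuracy) if affordable else None


# Hypothetical candidates, for illustration only.
candidates = [
    Heuristic("A", 0.01, 0.90),
    Heuristic("B", 0.03, 0.97),
    Heuristic("C", 0.04, 0.95),   # dominated by B
    Heuristic("D", 0.10, 0.995),
    Heuristic("E", 0.10, 0.93),   # dominated by B and D
]
front = pareto_front(candidates)
choice = pick_for_budget(front, max_cost=0.05)
```

With these made-up values the front consists of A, B, and D, and a budget of 5% of all experiments selects B. Raising the budget to 10% would select D instead, which is the kind of run-count-for-accuracy trade the text describes.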
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Schirmeier, H., Borchert, C., Spinczyk, O. (2014). Rapid Fault-Space Exploration by Evolutionary Pruning. In: Bondavalli, A., Di Giandomenico, F. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2014. Lecture Notes in Computer Science, vol 8666. Springer, Cham. https://doi.org/10.1007/978-3-319-10506-2_2
DOI: https://doi.org/10.1007/978-3-319-10506-2_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10505-5
Online ISBN: 978-3-319-10506-2
eBook Packages: Computer Science (R0)