AN-Encoding Compiler: Building Safety-Critical Systems with Commodity Hardware

  • Christof Fetzer
  • Ute Schiffel
  • Martin Süßkraut
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5775)

Abstract

We expect commodity hardware to be used increasingly in safety-critical applications. At the same time, commodity hardware is expected to become less reliable and more susceptible to soft errors because of decreasing feature size and reduced power supply. Software-implemented approaches for dealing with unreliable hardware will therefore be needed. To simplify the handling of value failures, we provide failure virtualization: we transform arbitrary value failures caused by erroneous execution into fail-stop failures, which are easier to handle. To this end, we use the arithmetic AN-code because it provides very good error detection capabilities. Arithmetic codes are well suited to protecting commodity hardware because their guarantees hold independently of the executing hardware. This paper presents the encoding compiler EC-AN, which applies AN-encoding to arbitrary programs. To our knowledge, this is the first complete AN-encoding implemented in software. Earlier encoding compilers either encode only small parts of applications or trade off safety to enable complete AN-encoding.
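The idea behind AN-encoding can be illustrated with a minimal sketch. It is not the EC-AN compiler's implementation: the constant `A`, the exception `ValueFailure`, and all function names are assumptions chosen for illustration. Every value x is stored as the code word A·x, so any valid code word is a multiple of A. Addition preserves this property, and a corrupted word that is no longer a multiple of A is turned into an exception, which stands in for a fail-stop failure.

```python
# Minimal AN-code sketch. A, ValueFailure and all function names are
# hypothetical; they are not taken from the EC-AN compiler.

A = 58659  # illustrative odd constant; the choice of A is an assumption


class ValueFailure(Exception):
    """Raised when a code word is not a multiple of A (stand-in for fail-stop)."""


def encode(x: int) -> int:
    """Encode a non-negative integer as the code word A * x."""
    return A * x


def check(xc: int) -> None:
    """Stop execution if xc is not a valid code word."""
    if xc % A != 0:
        raise ValueFailure(f"invalid code word: {xc}")


def decode(xc: int) -> int:
    """Check a code word and return the functional value it represents."""
    check(xc)
    return xc // A


def add_encoded(xc: int, yc: int) -> int:
    """Add two code words: A*x + A*y == A*(x + y), so the sum is a code word."""
    return xc + yc


def flip_bit(xc: int, bit: int) -> int:
    """Simulate a soft error by flipping one bit of a code word."""
    return xc ^ (1 << bit)


if __name__ == "__main__":
    total = add_encoded(encode(20), encode(22))
    print(decode(total))  # 42

    # A single bit flip changes the word by +/- 2**bit, which is never a
    # multiple of an odd A > 1, so the check always catches it.
    try:
        decode(flip_bit(total, 3))
    except ValueFailure as err:
        print("detected:", err)
```

In this sketch the check runs only on decode. Where and how often checks run, and how operations other than addition are encoded, are design decisions the sketch does not address.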

Keywords

Soft Error; Arithmetic Code; Commodity Hardware; Dynamic Binary Instrumentation; Decrease Feature Size
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Christof Fetzer (1)
  • Ute Schiffel (1)
  • Martin Süßkraut (1)
  1. Department of Computer Science, Technische Universität Dresden, Dresden, Germany
