Murphy Was an Optimist

Conference paper: Computer Safety, Reliability, and Security (SAFECOMP 2010)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6351)


Abstract

Embedded, safety-critical systems often have requirements for incredibly small probabilities of failure, e.g. 10⁻⁹ for a one-hour exposure. One often hears designers of safety-critical systems say: "We have to tolerate all credible faults."
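As a rough back-of-the-envelope illustration (the fleet figures below are assumed for illustration and are not from the paper), a short Python sketch shows what a 10⁻⁹-per-hour budget means at fleet scale:

    # Hypothetical illustration: what a 1e-9 per-hour failure budget
    # implies over the life of an assumed aircraft fleet.
    failure_rate_per_hour = 1e-9   # required maximum failure probability per hour
    fleet_size = 5_000             # assumed number of aircraft
    hours_per_year = 3_000         # assumed flight hours per aircraft per year
    service_years = 30             # assumed service life

    total_exposure_hours = fleet_size * hours_per_year * service_years
    expected_failures = failure_rate_per_hour * total_exposure_hours

    print(f"Fleet exposure: {total_exposure_hours:.2e} hours")        # 4.50e+08
    print(f"Expected failures at the limit: {expected_failures:.2f}") # 0.45

    # The budget allows for fewer than one failure across the entire
    # service life of the entire fleet.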

However, the word "credible" in this assertion contrasts starkly with the word "incredibly" in the sentence before. In fact, there are faults and failures that most designers think can't happen which actually can and do happen, with probabilities far greater than the requirements allow. The well-known Murphy's Law states: "If anything can go wrong, it will go wrong." When requirements limit failure probabilities to one-in-a-million or less, this should be rewritten as: "If anything can't go wrong, it will go wrong anyway."

A couple of factors lead designers to erroneously think that certain faults and failures are impossible when, in fact, not only are they possible, but some are actually highly probable.

One factor is that the requirements are outside any designer's experience, even when that experience includes that of colleagues. Using the literature seems like an obvious way of expanding one's (virtual) experience. However, there are two problems with this. The first problem is that people who actually design safety-critical systems are rarely given enough time to keep current with the literature. The second problem is that the literature on actual occurrences of rare failure modes is almost nonexistent. Reasons for this include: people and organizations don't want to admit they had a failure; designers feel that rare failure occurrences aren't worth reporting; and, if designers aren't given enough time to read the literature, they certainly aren't given enough time to write it. Takeaway: designers should fight their management for time to keep current with the literature, and designers should use every report of a rare failure as an opportunity to imagine other, similar modes of failure.
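To make the experience gap concrete, here is a minimal sketch (career figures assumed purely for illustration):

    # Hypothetical sketch: a single designer's lifetime of observation
    # versus the mean time to one 1e-9-per-hour event.
    career_years = 40
    career_hours = career_years * 365 * 24   # ~3.5e5 hours, even observing nonstop

    mean_hours_to_event = 1 / 1e-9           # 1e9 hours between events on average

    careers_needed = mean_hours_to_event / career_hours
    print(f"Career exposure: {career_hours:.2e} hours")
    print(f"Careers needed to expect one event: {careers_needed:.0f}")  # ~2854

    # No individual career, and no circle of colleagues, comes within
    # orders of magnitude of the exposure the requirement describes.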

The other factor that leads designers to erroneously think that certain faults and failures are impossible stems from abstraction. The complexity of modern safety-critical systems requires some form of abstraction. However, when designers limit their thinking to one level of abstraction, certain faults and failures can seem impossible that would clearly be seen as probable if one were to examine layers below that level of abstraction. For example, a designer thinking about electrical components would not include in their FMEA the possibility that one component (e.g. a diode) could transmogrify into another component (e.g. a capacitor). But, at a lower level of abstraction, it can be seen that a crack through a diode die can create a capacitor. And a crack is one of the most highly probable failure modes at the physical-material level of abstraction.
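A minimal sketch of the lower abstraction level, with the die geometry assumed purely for illustration: treating the two faces of a cracked die as a parallel-plate capacitor, C = ε₀εᵣA/d, even a micron-wide crack yields several picofarads:

    # Hypothetical geometry: a hairline crack through a diode die forms
    # two conductive faces separated by a thin gap -- electrically, a
    # parallel-plate capacitor.
    EPSILON_0 = 8.854e-12           # vacuum permittivity, F/m

    die_area_m2 = 1e-6              # assumed 1 mm x 1 mm die cross-section
    crack_gap_m = 1e-6              # assumed 1 micrometre crack width
    relative_permittivity = 1.0     # air-filled gap

    capacitance = EPSILON_0 * relative_permittivity * die_area_m2 / crack_gap_m
    print(f"Parasitic capacitance: {capacitance * 1e12:.1f} pF")  # ~8.9 pF

    # Several picofarads are enough to couple fast transients across a
    # junction that a component-level FMEA models only as "open" or "short".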

Examples of rare but actually occurring failures will be given. These will include a number of Byzantine faults, component transmogrification, fault-mode transformation (e.g. stuck-at faults that aren't so stuck), the dangers of self-inflicted shrapnel, component creation via emergent properties, "evaporating" software, and exhaustively tested software that still failed.




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Driscoll, K.R. (2010). Murphy Was an Optimist. In: Schoitsch, E. (eds) Computer Safety, Reliability, and Security. SAFECOMP 2010. Lecture Notes in Computer Science, vol 6351. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15651-9_36

  • DOI: https://doi.org/10.1007/978-3-642-15651-9_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15650-2

  • Online ISBN: 978-3-642-15651-9

  • eBook Packages: Computer Science, Computer Science (R0)
