Engineering Analysis of Failure: A Determination of Cause Method
The purpose of this article is to present an engineering method for the determination of cause by the identification of defects that lead to failure. Further discussion on this topic is of course warranted and this article is anticipated to catalyze such discussions. In order to facilitate the development and presentation of this cause determination method, the definition of defect as it relates to failure is first presented. While not all failures are the result of defects (and not all defects result in failures), identification of a defect may point to opportunities to prevent recurrence and assist in the determination of cause. Furthermore, use of this method also serves to identify causes of failure that are not attributable just to the actions of responsible parties: such as wear and tear, acts of nature, and the unknown. Once the cause is identified, the resolution, recovery, and recurrence prevention process then has the opportunity to move forward. To further demonstrate application of the cause determination method presented here, case studies of failures are provided.
Keywords: Failure analysis, Cause determination, Defect identification, Hazard, Risk, Controlled hazard
When consideration is given to the cause of a failure (with consequent loss in the form of property damage or personal injury), two primary aspects are at issue: reasonable preventability of the failure, and responsibility. A related aspect of preventability is whether information that comes to light as a result of analyzing the failure will make prevention of recurrence of the failure mode practical; that is, whether reasonable means to prevent such a failure can be determined.
The approach of this article is to apply the perspective of the engineer to the question of cause determination. As such, this article is intended to facilitate the work of engineers in fulfilling their primary responsibility to “Hold paramount the safety, health, and welfare of the public.”
A determination that a failure was preventable by reasonable means implies that some lack of reasonableness was present prior to the failure. However, it is important to note that reasonable means available after a failure has taken place may not have been available before the failure took place. That is, the occurrence of a failure may of itself bring to light unanticipated information that now allows reasonable preventative steps to be taken.
While much more can be said on this topic, the purpose of this article is to present an engineering method for the determination of cause by the identification of defects that lead to failure. The work of Charles O. Smith on product liability and design, published in the ASM Handbook, Volume 11, Failure Analysis and Prevention, is of particular pertinence to this discussion. Further discussion on this topic is of course warranted and this article is anticipated to catalyze such discussions. In order to facilitate the development and presentation of this cause determination method, the definition of defect as it relates to failure is first presented. While not all failures are the result of defects (and not all defects result in failures), identification of a defect may point to opportunities to prevent recurrence and assist in the determination of cause. Furthermore, use of this method also serves to identify causes of failure that are not attributable just to the actions of responsible parties, such as wear and tear, acts of nature, and the unknown. Once the cause is identified, the resolution, recovery, and recurrence prevention process then has the opportunity to move forward. To further demonstrate application of the cause determination method presented here, case studies of failures are provided.
Definition of Defect
Hazard: a condition or situation that can result in property damage, personal injury, or death.
Risk: the probability that a hazard will become manifest.
Controlled Hazard: a hazard for which all reasonable steps have been taken to minimize the risk associated with the hazard (and for which no unreasonable steps have been taken that increase the risk associated with the hazard).
Defect: an uncontrolled hazard, that is, a hazard marked by a lack of reasonable steps or a presence of unreasonable steps.
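The four definitions above can be read as a simple predicate. The following is a minimal sketch, assuming that the engineer's judgments of reasonableness have already been reduced to two yes/no findings; the class and function names are hypothetical and are not part of the method as published.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hazard:
    """A condition or situation that can result in damage, injury, or death."""
    description: str
    reasonable_steps_taken: bool    # were all reasonable risk-minimizing steps taken?
    unreasonable_steps_taken: bool  # were any unreasonable risk-increasing steps taken?


def is_controlled(hazard: Hazard) -> bool:
    """Controlled hazard: reasonable steps present and unreasonable steps absent."""
    return hazard.reasonable_steps_taken and not hazard.unreasonable_steps_taken


def is_defect(hazard: Hazard) -> bool:
    """Defect: an uncontrolled hazard."""
    return not is_controlled(hazard)


# Illustrative case (assumed values): an excavation left without barriers.
open_hole = Hazard("excavation without barriers",
                   reasonable_steps_taken=False,
                   unreasonable_steps_taken=False)
print(is_defect(open_hole))
```

The sketch deliberately leaves out risk (a probability) because the definition of defect above depends only on the presence or absence of reasonable and unreasonable steps.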
Determination of Cause Method
Wear and tear.
Actions of a person, persons, entity, or entities.
Act of the natural world (sometimes referred to as an act of God).
Referencing Fig. 1, Decision Point D1 poses the question “Is the loss/damage/injury consistent with reasonable care and use over a period of time of the involved object(s)?” A yes answer to this question will lead to Cause C1: wear and tear. An example that would fall into this category is automobile tires that exhibit uniform wear consistent with use of the tires. Another example would be weathering to the exterior of a building that is consistent with the age of the building. In both instances, signs of abuse would not be observed and indication of proper maintenance would be confirmed. A no answer to the question then leads to Decision Point D2.
Decision Point D2 poses the question “Are the hazards that resulted in the loss/damage/injury able to be identified?” A no answer at this point will lead to Cause C2: unknown. An example that would fall into this category might be a piece of electronic equipment that cannot be tested since it is not functioning and where the history of the equipment is not known. A yes answer then leads to Decision Point D3.
Decision Point D3 poses a compound question, the central point of which is to assess whether or not the actions leading up to the failure were reasonable. Lack of reasonableness may be manifest either in the absence of a reasonable act or in the presence of an unreasonable act. To wit: were reasonable steps taken to minimize the risk due to the identified hazard(s) and were there no unreasonable steps taken that would have increased the risk from the hazard? A no answer asserts an identifiable unreasonableness and will lead to Cause C3a, a defect due to actions of a person or entity (see Note 1). An example of a lack of reasonable steps would be a contractor who dug a hole at a construction site but did not put up barriers or markers to prevent someone from falling into the hole. An example of the presence of unreasonable steps would be a modification to a machine or process operating system that exposes operating personnel to injury. A yes answer leads to Decision Point D4.
Decision Point D4 (and also Decision Point D2 discussed earlier) recognizes that, unsatisfying as it may be, there are times when the information available is insufficient to identify a cause and therefore leads us to Cause C2: unknown. Decision Point D4 poses the question “Was the loss due to identified hazards that exceeded the control provided by reasonable steps taken?” Consider a building damaged by a fire in which the fire damage is great enough that the pre-fire conditions could not be established, the origin of the fire could not be determined, and for which the pre-fire history of the building is not known. Whether or not the controls were exceeded cannot be assessed and a no answer is required. Another example would be a code-compliant building that has sustained damage from winds weaker than those the building, as designed and constructed, should have withstood (see Note 2). No deficiencies of materials, workmanship, or design are identified. In this case, adequate controls for the identified hazard (wind) were in place, but damage was sustained nevertheless. Either there was another unknown hazard (different from the identified wind load hazard) that resulted in the loss or some aspect of the history of the building created a deficiency that is not able to be identified. Regardless, there would be a no answer which leads to Cause C2: unknown. Another example leading to a no answer here would be a fractured part where design, choice of materials, and installation are known and confirmed, but where the service use history is not known.
Further consideration of Decision Point D4 leads to the alternative in which the controls for the identified hazard are known to have been exceeded. A yes answer would lead to Decision Point D5. An example here would be a 500-year flood (hazard) that ruptured (damaged) a dam built to withstand a 100-year flood. Another example would be a consumer product that was manufactured according to the best knowledge and practices at the time of manufacture that later injured a user due to some previously unrecognized hazard. The protections provided by the control were exceeded by the hazard.
Recall that in order to reach this point in the analysis, it has been established that the actions leading up to the failure were reasonable and that the controls that the reasonable actions put in place were exceeded by the hazard. Further, note that the example hazards presented in consideration of Decision Point D4 included a hazard of natural origin (the 500-year flood) and a hazard of human origin (injury from a manufactured product). Decision Point D5 seeks to differentiate between human-created hazards and natural hazards and, therefore, poses the question “Was the hazard that resulted in the damage due to a human-created hazard?” A no answer will lead to Cause C4: Act of the Natural World. A yes answer leads to Decision Point D6.
Arriving at this point in the analysis, if the hazard is now recognizable due to information brought to light by the failure, then a judgment may be made as to the reasonableness of the hazard. Given the information that exists due to the fact that the failure has taken place, Decision Point D6 then poses the question “Were the conditions prior to the loss unreasonably hazardous?” A yes answer will lead to Cause C3a: a defect due to actions of a person or entity. In a case such as this, even though due care was exercised and appropriate reasonable steps were taken to prevent loss from the hazard (based upon the information and practices available at the time), the loss still occurred as a result of a hazard that was not controlled. Further, a person or entity was involved.
This example defines a special case of assigning responsibility (i.e., assigning cause) that was ultimately established in case law in the USA. The legal term that applies is “strict liability,” a concept that needs to be recognized and appreciated by the engineer. Referencing step D2 in Fig. 1, the hazard that resulted in the loss could not have been known in advance. The answer at this step is no. However, under the theory of strict liability, the cause, rather than being unknown, is attributed to a person or an entity under Cause C3a: defect due to actions of a person or entity. Two examples here will suffice to explain. A first example would be a company that creates a product which is identified as defective only after the fact (without a priori knowledge of the hazardous condition). The product later results in a loss or injury. Only then is the product identified as defective and only then is there recognition of steps that could have been taken to control the hazardous condition of the product. A product, therefore, can be deemed defective even though the manufacturer had taken reasonable care in its production and there is no record of abuse. A second example would be a premature service-related fracture of an axle on a machine or on a vehicle. The design and manufacturing record confirms that all reasonable steps were taken to avoid premature fracture. The vehicle was not misused. Analysis determines that the fracture was due to a latent manufacturing deficiency in the material, present in spite of the record of design, manufacture, and care.
Alternatively, if the utility of an object is inseparable from a hazard and a benefit is derived from use of the object, then the object has an inherent hazard that is reasonable and the risk from the inherent hazard is borne by a person or entity. The answer to the question at Decision Point D6 is no. While in this instance there is no defect, because there is a benefit derived from an inherent hazard the risk is borne by a person or entity, which leads to Cause C3b: actions of a person or entity. Objects whose basic function is not separable from some hazardous feature would be deemed reasonably hazardous. A sharp knife would be a recognizable example that falls into this category. Flammable fuel for automobiles would be another.
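The sequence of Decision Points D1 through D6 described above can be sketched as a short function. This is an illustrative sketch only: the names `Findings`, `Cause`, and `determine_cause` are hypothetical, and each decision point is reduced to a yes/no value, so a question that cannot be assessed (as in the fire-damaged building example at D4) must be entered as a no.

```python
from dataclasses import dataclass
from enum import Enum


class Cause(Enum):
    C1 = "wear and tear"
    C2 = "unknown"
    C3A = "defect due to actions of a person or entity"
    C3B = "actions of a person or entity (no defect)"
    C4 = "act of the natural world"


@dataclass(frozen=True)
class Findings:
    consistent_with_reasonable_use: bool    # D1
    hazards_identified: bool = False        # D2
    steps_were_reasonable: bool = False     # D3
    hazard_exceeded_controls: bool = False  # D4
    human_created_hazard: bool = False      # D5
    unreasonably_hazardous: bool = False    # D6


def determine_cause(f: Findings) -> Cause:
    if f.consistent_with_reasonable_use:  # D1 yes
        return Cause.C1
    if not f.hazards_identified:          # D2 no
        return Cause.C2
    if not f.steps_were_reasonable:       # D3 no
        return Cause.C3A
    if not f.hazard_exceeded_controls:    # D4 no (or cannot be assessed)
        return Cause.C2
    if not f.human_created_hazard:        # D5 no
        return Cause.C4
    # D6: yes leads to a defect (C3a); no leads to an inherent, reasonable hazard (C3b)
    return Cause.C3A if f.unreasonably_hazardous else Cause.C3B


# The 500-year flood acting on a dam built for a 100-year flood.
flood = Findings(False, hazards_identified=True, steps_were_reasonable=True,
                 hazard_exceeded_controls=True, human_created_hazard=False)
print(determine_cause(flood).value)
```

The function returns a single cause. Where more than one factor is present, the same function could be run once per identified hazard, which is an assumption of this sketch rather than a step stated in the method.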
Thoughts as to Putting the Method into Practice
Use of this method often leads to a well-defined single cause for a failure. However, in some instances the cause of a failure is attributable to a combination of underlying factors. That is, there may be more than one cause. For example, a product which is unreasonably hazardous may be used by someone in a way that is also unreasonable, and the combination of these conditions results in an injury. Elimination of either the unreasonable hazard or the unreasonable use would have prevented the failure. Thus, it is incumbent upon the engineer in conducting the analysis to determine whether multiple factors were present.
Case Study 1: Rupture of a Pressure Vessel: Improper Maintenance
This incident concerns the rupture of a steam accumulator that was part of a steam-generating facility, with consequent damage to the facility. On a normal working day while being operated at its typical working pressure of 120 psi, the pressurized vessel ruptured without warning. The weld that connected the bottom head section of the pressure vessel to the main shell section had fractured, separating the bottom head of the vessel from the shell. Manufacture of this pressure vessel falls under Section VIII, Division 1 of the ASME Boiler and Pressure Vessel Code. It had been designed for a maximum working pressure of 150 psi. An appropriate steel had been used for the manufacture of the pressure vessel. The code specifies that a hydrostatic test pressure of 225 psi be applied to the pressure vessel at the time of manufacture.
A review of the history of the pressure vessel revealed a drain coupling on the bottom of the pressure vessel had been repaired a few years prior to the incident. Repairs were carried out in accordance with the National Board Inspection Code (the NBIC), with a hydrostatic pressure test at 120 psi.
Eighteen months prior to the incident, a small leak was detected at a crack in the weld that connected the bottom head to the shell of the pressure vessel. Repair work was undertaken by welding the crack. The vessel was not subjected to a hydrostatic test prior to being placed back into service.
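The pressures reported in this case study can be compared with a few lines of arithmetic. The sketch below uses only the figures given in the text; the constant and function names are hypothetical, and the ratios are offered as a reading aid, not as code requirements.

```python
# Pressures in psi, as reported in the case study.
OPERATING = 120                 # typical working pressure at the time of rupture
MAX_WORKING = 150               # design maximum working pressure
HYDRO_AT_MANUFACTURE = 225      # hydrostatic test at original manufacture
HYDRO_AFTER_DRAIN_REPAIR = 120  # hydrostatic test after the drain coupling repair


def ratio(test_pressure: float, reference_pressure: float) -> float:
    """Test pressure expressed as a multiple of a reference pressure."""
    return test_pressure / reference_pressure


print(ratio(HYDRO_AT_MANUFACTURE, MAX_WORKING))    # manufacture test vs. design pressure
print(ratio(HYDRO_AT_MANUFACTURE, OPERATING))      # manufacture test vs. operating pressure
print(ratio(HYDRO_AFTER_DRAIN_REPAIR, OPERATING))  # drain-repair test vs. operating pressure
```

No ratio can be computed for the later weld repair, because the vessel was returned to service without a hydrostatic test.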
Examination of the fracture surfaces subsequent to the incident revealed that a crack (present since the time of original manufacture) had grown due to cyclic loading while the pressure vessel was in service, resulting in the rupture. The precise size of the crack at the time of manufacture could not be determined. However, the pressure vessel did pass the ASME code required hydrostatic testing conducted at the time of manufacture.
The leak that precipitated the repair work was determined to be from a crack in the weld that connected the bottom head to the shell of the pressure vessel, a crack that had been present from the time of original manufacture. The crack, however, was not large enough to preclude passing hydrostatic tests both at original manufacture and after replacement of the drain coupling. Return now to the repair of the crack at the location where the vessel was found to be leaking. Repair of such cracks is governed by the NBIC, which requires that the crack be removed as a part of the repair process. Post-loss examination demonstrated that, in fact, the crack was not removed as a part of the repair. The requirements of the NBIC were not fulfilled. As the pressure vessel continued in service, the crack continued to grow.
The crack had manifested as a leak 18 months prior to the rupture, consistent with the “leak before rupture (or break)” design philosophy appropriate for pressure vessels. The severity of this earlier condition was modest (a release of steam with limited potential for injury or additional property damage). Repair of this crack created conditions that resulted in a “rupture before leak” with consequent greater severity (property damage and personal injury).
Case Study 2: Vehicle Fires: Reasonable Steps Taken at Time of Manufacture: Hazard Not Recognized Until After the Hazard Had Become Manifest
Fire science teaches that in order for a fire to start three things must be brought together in the right combination to enable an uninhibited chemical reaction: a competent ignition source, an ignitable fuel, and oxygen (or an oxidizer). Unintended fires are prevented by keeping this combination from being brought together. In the context of the design and manufacture of internal-combustion engine powered vehicles, ignition sources, ignitable fuels, and oxygen are present throughout the vehicle. The prevention of fires is accomplished by preventing an ignitable combination from coming together.
Partial list of competent ignition sources includes:
Wiring faults in the engine compartment.
Wiring faults in the passenger compartment.
Friction from locked-up accessory drive pulleys.
Wheel bearing deprived of required lubrication.
Partial list of ignitable materials includes:
Fuel (gasoline, diesel, propane, natural gas).
Miscellaneous plastic components within the engine compartment.
Upholstery in the passenger compartment (these are often fire resistant but not fire proof – i.e., they will burn if flame is supported by another fuel).
Foreign objects/debris that has accumulated in the vehicle (grass, objects from the road, animals impacted while in motion or building nests, etc.)
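The principle that a fire requires an ignition source, a fuel, and an oxidizer to be brought together can be sketched as a set check per location in the vehicle. This is a simplified illustration: the function name and the example locations are hypothetical, and the check ignores whether the three elements are present in the right combination (for example, whether the ignition source is competent for the fuel at hand).

```python
REQUIRED = frozenset({"ignition source", "fuel", "oxidizer"})


def ignitable_locations(locations: dict[str, set[str]]) -> list[str]:
    """Return the locations where all three elements are present together."""
    return sorted(name for name, present in locations.items()
                  if REQUIRED <= present)


# Assumed example: a wiring fault beside a fuel line in the engine compartment,
# and upholstery with no ignition source in the passenger compartment.
vehicle = {
    "engine compartment": {"ignition source", "fuel", "oxidizer"},
    "passenger compartment": {"fuel", "oxidizer"},
}
print(ignitable_locations(vehicle))
```

Removing any one element from a location drops it from the result, which mirrors the prevention strategy of keeping the combination from coming together.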
After conducting investigations of fires that took place in these vehicles, the manufacturer determined that a different style of clamp, with no raised area that could rub on the adjoining diesel fuel line, could be utilized. However, prior to what was now an appreciation of the hazardous condition presented by the original diesel fuel hose clamp, the step of utilizing the alternative clamp was not recognized. Once the hazard of a fire in the vehicles as a result of the use of the original clamp was recognized, it was then realized that a reasonable means was available to minimize the risk of the fire hazard.
A determination of cause method, intended to assist the engineer in the investigation of failure, has been presented along with two example cases. Further, a definition of defect has been proposed to facilitate the investigation of failure. The engineering judgment in the application of the method and the definition of defect hinge upon a qualified assessment of what is reasonable and what is unreasonable. Reasonableness is the core of the work of engineers in their efforts to protect the public. As noted at the outset of this paper, the results of the engineer’s analysis are useful in addressing the question of loss resolution, recovery, and compensation.
Note 1: The “person or entity” cause is separated into two categories to acknowledge the fact that a person or entity can be responsible for a loss incident even when a defect does not exist.
Note 2: In this example the engineer will need to consider whether the code requirements are performance driven or prescriptive.
- 1. Code of Ethics for Engineers, National Society of Professional Engineers, as revised July 2007
- 2. ASM Handbook, Failure Analysis and Prevention, vol. 11, pp. 71–78
- 3. Merriam Webster’s Collegiate Dictionary, p. 302, Tenth edition
- 4. Black’s Law Dictionary, p. 507, Tenth edition
- 5. ASM Materials Engineering Dictionary, p. 111
- 6. ASM Glossary, p. 1065
- 7. ASM Handbook, Failure Analysis and Prevention, vol. 11, pp. 72–73
- 8. A Concise Restatement of Torts, pp. 276–331, 2nd edn. (2010)
- 10. Š. Pacholková, H. Taylor, Theoretical background of “Leak-before-break” as a concept in pressure vessels design. Metal 5, 1–8 (2002)