Human Performance, Levels of Service and System Resilience
The concept of resilience has spread widely in recent years and is broadly used to examine the dynamic response of critical sectors to disruptions. Resilience is frequently associated with the ability of a system to return to normal operational conditions after a shock event. Numerous definitions of resilience have been introduced and measures of resilience developed. Yet the existing literature shows a lack of agreement on how to operationalise resilience. This chapter expresses resilience in relation to system performance and levels of service. As people at all levels of an organisation play a significant role in creating (or not) resilience, the human contribution to the resilience of critical infrastructure is discussed. Here, the four resilience cornerstones, i.e., knowing what to do, what to look for, what to expect, and what has happened, help structure the discussion. This standpoint is found to support a robust operationalisation of resilience.
Keywords: Human performance, Critical infrastructure, Levels of service, Resilience cornerstones
Over the last decade, the concept of resilience has developed substantially. The literature (e.g., [2, 3, 4, 5]) comprises diverse definitions, resulting in the lack of a universal understanding of the construct and, in turn, of its further operationalisation [6, p. 2713]. Consequently, work is still required to make the notion comprehensible and usable for the relevant stakeholders.
To this end, this chapter focuses on the operational dimension of resilience. Its scope is threefold: first, to associate resilience with a system's levels of service; second, to investigate how this could be implemented and used by relevant stakeholders in their daily operations; and, third, to investigate the relationship and contribution of humans to resilience, considering that, in order to cope with real-world complexity, individuals as well as organisations constantly adjust their performance to the current conditions.
6.2 Resilience as System Behaviour and Service Levels
Resilience is described as the operational behaviour of a system subsequent to an endogenous or exogenous shock event, and it is associated with four response behaviours. The first response behaviour, robust, describes a system that can fully recover after a shock event. The second and third response behaviours, ductile and collapsing respectively, refer to a system that either recovers only its basic and critical functions or collapses after a shock event. Finally, the fourth response behaviour, adaptive, represents a system that reaches a performance level higher than the original level, e.g., when the system is reconfigured during its recovery and restoration.
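The four response behaviours can be illustrated with a minimal Python sketch. It assumes that performance is expressed as a single normalised number, that a final level of zero stands for collapse, and that two levels within a small tolerance count as equal. The names `ResponseBehaviour`, `classify_response` and `tolerance` are hypothetical and are not part of the chapter.

```python
from enum import Enum


class ResponseBehaviour(Enum):
    ROBUST = "robust"          # full recovery to the original level
    DUCTILE = "ductile"        # only basic and critical functions recovered
    COLLAPSING = "collapsing"  # no recovery after the shock
    ADAPTIVE = "adaptive"      # recovery to a level above the original one


def classify_response(original_level: float, final_level: float,
                      tolerance: float = 0.01) -> ResponseBehaviour:
    """Classify a response from the pre-shock and post-recovery performance.

    Both levels are hypothetical normalised performance values; 0.0 is
    assumed to mean that the system no longer functions. `tolerance` is an
    assumed margin within which two levels are treated as equal.
    """
    if final_level <= 0.0:
        return ResponseBehaviour.COLLAPSING
    if final_level > original_level + tolerance:
        return ResponseBehaviour.ADAPTIVE
    if final_level >= original_level - tolerance:
        return ResponseBehaviour.ROBUST
    return ResponseBehaviour.DUCTILE
```

In this sketch a system that settles at 60 % of its original performance would be labelled ductile, while one that settles above its original level would be labelled adaptive. Where the boundary between "basic functions recovered" and "collapsed" lies in practice is a judgement the sketch does not attempt to make.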
Optimal level of service (OpLoS): The theoretical condition for which the system was planned and designed.
Normal level of service (NLoS): The system is performing as required and expected, achieving its mission to supply the anticipated level of service, while all the system's outputs are in their normal state.
Acceptable level of service (ALoS): The system's performance is partially degraded, with one or more of the system's outputs in a disturbed mode. Still, due to the action(s) taken, i.e., contingency plan(s), the system can maintain the service quality at acceptable levels and limit its degradation.
Unacceptable level of service (ULoS): The system's performance is severely degraded and, despite the action(s) taken, its degradation has become unacceptable. The system is no longer able to accomplish its mission.
Out of service level (OLoS): Discontinuation of the service.
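As a rough illustration of how the five levels above might be encoded for monitoring purposes, the following Python sketch maps a single service indicator onto them. The chapter defines the levels qualitatively, so the indicator (`service_fraction`, the delivered service as a fraction of design capacity) and all numeric thresholds are assumptions made purely for illustration.

```python
from enum import Enum


class LevelOfService(Enum):
    OPTIMAL = "OpLoS"
    NORMAL = "NLoS"
    ACCEPTABLE = "ALoS"
    UNACCEPTABLE = "ULoS"
    OUT_OF_SERVICE = "OLoS"


# Hypothetical lower bounds on the delivered service fraction, highest first.
# These values are illustrative assumptions, not taken from the chapter.
THRESHOLDS = [
    (1.00, LevelOfService.OPTIMAL),
    (0.90, LevelOfService.NORMAL),
    (0.60, LevelOfService.ACCEPTABLE),
]


def classify_los(service_fraction: float) -> LevelOfService:
    """Map a momentary service fraction onto one of the five levels."""
    if service_fraction <= 0.0:
        return LevelOfService.OUT_OF_SERVICE  # service discontinued
    for lower_bound, level in THRESHOLDS:
        if service_fraction >= lower_bound:
            return level
    return LevelOfService.UNACCEPTABLE
```

A real operator would set such thresholds together with the regulator and would probably combine several outputs rather than rely on one indicator.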
Resilience is applicable to safety-critical as well as other systems. For the former, whose loss or failure has direct safety implications, resilience emphasises continued and correct operation in the wake of disruptions. For other systems, whose purpose is not a safety function but which are considered essential (critical) infrastructure, resilience implies continued operation as well. Naturally, if this service must be safe, resilience means continued operation at the required safety level; operation with reduced safety levels is excluded, at least in this discussion. Nevertheless, the degradation or loss of such a service may have indirect safety implications. In the case of public transport systems as discussed in this chapter, for instance, overcrowding on station platforms may have safety consequences, or the resulting congestion on other transport modes may hinder emergency services as a knock-on effect.
6.3 The Human Contribution to Resilience
Woods describes resilience as a parameter of a system that captures how well that system can adapt to handle events that challenge the boundary conditions for its operation. Such events may occur due to (i) limitations in the plans and procedures, (ii) the tendency of systems to adapt given changing pressures and expectations for performance, and (iii) environmental changes. The system's capacity to respond to challenging events lies partially in the expertise, strategies, and tools that people employ to respond to certain challenges.
Knowing what to do, which refers to the ability to respond to regular and irregular disruptions and disturbances by adjusting normal functioning or activating ready-made responses.
Knowing what to look for, which refers to the ability to monitor that which is or could become a threat in the near term. The monitoring should cover both what happens in the environment and what happens in the system itself, i.e., its own performance.
Knowing what to expect, which refers to the ability to anticipate developments and threats further into the future.
Knowing what has happened, which refers to the ability to learn from experience.
The second level, in the lower half of the figure, refers to a longer-term organisational response across the whole spectrum of an operation, including any normal and unexpected situations. It is assumed that the organisation's knowledge of what to do and what to look for, on which the response to a disruption is built, is itself built upon the organisation's previous experience and anticipation. Experience is derived from the organisation's learning from past events, while anticipation refers to its ability to identify potential future threats. In other words, learning and anticipating, corresponding to the "what has happened" and "what to expect" cornerstones respectively, together form the basis for preparedness, which is transformed concretely into the plans, policies, procedures and training that are applied in the actual response to a disruption.
6.4 Resilience Operationalisation Using the Four Cornerstones and the LoS Concept
This section demonstrates the operationalisation of resilience using the four resilience cornerstones and the concept of service levels in the transportation sector. Data were collected from publicly available reports [15, 16] that describe two major disruptions of the Singaporean metro system that occurred in December 2011 within a period of two days.
The immediate cause of the stalling of the trains was damage to their Current Collector Device (CCD) “shoes” due to sagging of the “third rail”, which supplies electrical power to the trains. During both incidents sections of the third rail sagged after multiple “claws”, which hold up the third rail above the trackbed, were dislodged. With their CCDs damaged, the trains were unable to draw electricity from the third rail to power their propulsion and other systems such as cabin lighting and air-conditioning.
The incident on 15 December 2011 was initiated by a defective fastener in the Third Rail Support Assembly (TRSA), which damaged the Current Collector Device (CCD) shoes of the trains that passed the incident site. In the process, these trains destabilised the third rail system elsewhere along the network, and the forces generated by the CCD shoes of multiple trains impacting the sagging third rail caused three more claws at the incident site to be dislodged, such that the third rail came to rest on the trackbed. Thereafter, this segment of the third rail became totally impassable to all trains.
The incident on 17 December 2011 was triggered by one or more "rogue trains" which suffered CCD shoe damage that was not easily detectable when passing the 15 December 2011 incident site as the third rail was progressively sagging. In their haste to resume revenue service on 16 December 2011, the metro personnel did not conduct a sufficiently thorough investigation, such that the CCD shoe damage on the rogue train(s) went undetected. Had the investigation been thorough, the incident on 17 December 2011 might have been prevented.
In addition, the analysis of the events prior to the disruptions identified numerous factors that contributed to the incidents, such as:
Defects on train wheels that resulted in severe vibration.
Gauge fouling, or contact with the third rail system by passing trains due to the separation between the third rail and the running rail.
Design of the current third rail claw.
Shortcomings in the maintenance work culture within Singaporean Mass Rapid Transit (SMRT).
Shortcomings in the maintenance and monitoring regime, mainly in the context of ageing assets.
This example highlights a service disruption due to a combination of failures. Had the failures happened independently, they would not have produced any substantial disruption to the system. Using the four resilience cornerstones, it could be claimed that all contributing factors are primarily associated with the SMRT's inability to provide its employees with the appropriate means (e.g., policies and procedures) to execute their tasks.
Regarding the levels of service, the SMRT managed to restore its service in a timely manner, while also providing alternative travel options to its customers (e.g., replacement buses). In spite of the preparedness to manage the disruptions indicated by this response, the LoS in both disruptions was deemed unacceptable, as implied by the fine imposed by the Singaporean Land Transport Authority. Thus, the elasticity threshold should not be related to acceptability, while the LoS, if measured as a momentary or average capacity, is not sufficient per se to discuss service degradation. Instead, a service degradation measure needs to consider the duration (width) of the disruption and not only its magnitude (depth).
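One possible way to combine depth and width into a single number is to integrate the shortfall below the normal service level over time. The Python sketch below does this with the trapezoidal rule. It is an illustration under stated assumptions, not a measure proposed in the cited reports: the function name `service_loss`, the normalised service scale, and the sample values are all hypothetical.

```python
def service_loss(times_h, service_levels, normal_level=1.0):
    """Integrate the shortfall below `normal_level` over time.

    `times_h` are sample times in hours and `service_levels` the service
    level observed at each time, on an assumed normalised scale.
    The result is in "service-level hours" (trapezoidal rule).
    """
    if len(times_h) != len(service_levels):
        raise ValueError("times_h and service_levels must have equal length")
    loss = 0.0
    for i in range(1, len(times_h)):
        dt = times_h[i] - times_h[i - 1]
        # Shortfall (depth) at both ends of the interval, never negative.
        s0 = max(0.0, normal_level - service_levels[i - 1])
        s1 = max(0.0, normal_level - service_levels[i])
        loss += 0.5 * (s0 + s1) * dt
    return loss


# Hypothetical disruptions: a short total outage and a long partial one.
short_deep = service_loss([0, 1, 2], [1.0, 0.0, 1.0])
long_shallow = service_loss([0, 1, 5, 6], [1.0, 0.5, 0.5, 1.0])
```

With these made-up profiles, the long partial disruption accumulates a larger loss than the short total outage, even though its depth is only half as large, which is the kind of distinction a momentary LoS value cannot express.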
With respect to the resilience cornerstones, the SMRT seems to have learnt from the experience of the first disruption and managed the second disruption better. Replacement buses and alternative travel options were deployed. Considering the longer duration of the second disruption, it appears important here to determine the boundaries of a system's LoS and service degradation. Indeed, the second disruption lasted longer, yet the passengers were better served and transferred to their destinations. Hence, a broader measure of the overall performance of the system (or of service degradation) considers not only the customers whose metro was not available but also the completed passenger trips independent of mode, e.g., by replacement services.
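A mode-independent measure of this kind could, for example, be the share of demanded trips that were completed by any means. The following Python sketch assumes that trip counts per mode are available. The function name `trip_completion_ratio` and all figures are hypothetical and are not data from the inquiry reports.

```python
def trip_completion_ratio(demand_trips, completed_by_mode):
    """Share of demanded passenger trips completed, irrespective of mode.

    `completed_by_mode` maps a mode name to the number of completed trips.
    The ratio is capped at 1.0 in case completed trips exceed the demand.
    """
    if demand_trips <= 0:
        raise ValueError("demand_trips must be positive")
    completed = sum(completed_by_mode.values())
    return min(1.0, completed / demand_trips)


# Hypothetical figures for one disrupted period.
metro_only = trip_completion_ratio(1000, {"metro": 200})
all_modes = trip_completion_ratio(1000, {"metro": 200, "replacement_bus": 500})
```

In this made-up case the metro-only view reports 20 % of trips completed, whereas the mode-independent view reports 70 %, reflecting the contribution of the replacement services.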
The SMRT example shows that organisations should not only prepare and plan for handling individual shock events, but also account for potential consequential effects and their impact on the system's overall resilience. This example underscores the importance of ensuring that recovery is not only timely but also durable, i.e., placing the system into an "as good as new" state in reliability terms. The incident on 17 December might have been prevented had the SMRT not been in haste to resume its service on 16 December, following the disruption on 15 December. Such haste led to the deployment of insufficiently investigated trains and, in turn, to the second disruption in the system.
Resilience is broadly used to study and understand the response of critical sectors to disruptions. However, the operationalisation of the notion has not been sufficiently explored. In this chapter, resilience was presented in association with a system's performance and its degradation in terms of service levels, as an undesired outcome distinct from those related to potential hazards to the public and environment. Resilience was described as the operational behaviour of a system subsequent to the occurrence of an endogenous or exogenous shock event. Further, five levels of service for a system were identified, i.e., the new normal, normal, acceptable, unacceptable and out of service level. The association between service levels and system resilience was also shown. Moreover, the acceptability of the system's response (service level trajectory) can be seen as largely unconnected with whether this response is elastic. Ultimately, the resilience of systems that deliver a service is defined not in terms of whether the system response exhibits an elastic behaviour but rather in terms of whether the service level trajectory is acceptable. Specifically, a ductile or inelastic response with a longer-term service level degradation may be acceptable; the acceptability criteria for the system will instead be based on response criteria such as the minimum service level maintained during the peak of the disruption, the magnitude of the longer-term degradation, and an overall service loss that combines the duration and magnitude of degraded service.
People at all levels of an organisation play a significant role in creating (or not) resilience. This chapter examined the human contribution to resilience, for which the four resilience cornerstones provide a helpful lens. Yet it could be seen that the functions the cornerstones describe need to be interpreted on two levels. First, on the organisational level, they concern anticipating threats, learning from disruptions, and incorporating the resulting lessons into contingency plans and training. Second, for the frontline personnel at the "sharp end", the functions become responding to a disruption per the procedures, monitoring whether the actions taken succeed in preventing or mitigating service degradation or in recovering service, and anticipating the system's evolution to enable a proactive response.
This chapter does not provide any figures of merit about a system's resilience involving the LoS or the probability/frequency and duration of the service degradation. Thus, future research will focus on evaluating different systems and their preparedness against unexpected events, and will also identify human-critical tasks and scenarios that could lead to significant losses.
Service Performance Indicators used by the Los Angeles County Metropolitan Transportation Authority, https://www.metro.net/about/metro-service-changes/service-performance-indicators/.
This work was conducted at the Future Resilient Systems at the Singapore-ETH Centre, which was established collaboratively between ETH Zurich and Singapore's National Research Foundation (FI 370074011) under its Campus for Research Excellence and Technological Enterprise programme.
- 8. Future resilient systems. Technical report, Singapore-ETH Centre (2015)
- 9. B. Robert, W. Pinel, J.-Y. Pairet, B. Rey, C. Coeugnard, Y. Hmond, Organizational resilience - concepts and evaluation method (Presses Internationales Polytechnique, 2010)
- 10. Levels of service performance measures for the seismic resilience of three waters network delivery. Technical report, UC Quake Centre, University of Canterbury, New Zealand (2015)
- 11. N. Leveson, N. Dulac, D. Zipkin, J. Cutcher-Gershenfeld, J. Carroll, B. Barrett, Engineering resilience into safety-critical systems, in Resilience Engineering: Concepts and Precepts, ed. by E. Hollnagel, D.D. Woods, N. Leveson (Ashgate, Aldershot, 2012), pp. 95–124
- 12. D.D. Woods, Resilience engineering: Redefining the culture of safety and risk management. Hum. Factors Ergon. Soc. Bull. 49(12), 1–3 (2006)
- 13. S.W. Dekker, E. Hollnagel, D.D. Woods, R. Cook, Resilience engineering: new directions for measuring and maintaining safety in complex systems. Technical report, Lund University School of Aviation (2008)
- 14. E. Hollnagel, Epilogue: RAG - the resilience analysis grid, in Resilience Engineering in Practice: A Guidebook, ed. by E. Hollnagel, J. Paris, D.D. Woods, J. Wreathall (Ashgate, Aldershot, 2011)
- 15. Singapore Ministry of Transport, Report of the committee of inquiry into the disruption of MRT train services on 15 and 17 Dec 2011 (2012)
- 16. Singapore Land Transport Authority, SMRT to be fined $2 million for December 2011 train service disruptions along the North South Line (2012)
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.