1 Introduction

Task analysis and modelling approaches have always focused on the explicit representation of the standard behavior of users, leaving user error analysis for later phases of the design process [2]. This is part of the rationale underlying task analysis, which is to provide an exhaustive analysis of user behavior describing goals and the activities performed to reach these goals. Clearly, errors, mistakes and deviations are not part of the users’ goals and are thus left out of task descriptions. This exhaustive aspect of task analysis is fundamental, as it is meant to provide the basis for a global understanding of users’ behaviors, which in turn serves as a basis for driving evolutions of the interactive system. However, practice (for real-life applications) shows that reaching this comprehensiveness is very hard, especially as it requires a vast amount of resources. If cuts have to be made when analyzing standard activities, infrequent or abnormal behaviors are often the first to be left out. However, this is precisely where the emphasis should be placed in order to deal efficiently with error tolerance, as error-prone systems deeply impact efficiency and satisfaction. Beyond these usability-related aspects, in critical systems the cost of an operator error might put people’s lives at stake, which is why Human Reliability Assessment (HRA) methods (such as HET, CREAM or HERT) provide means for identifying human errors. Such approaches go beyond the early work of Norman on typologies of human errors [20], which was later integrated into the action theory [21]. Indeed, they are usually associated with task descriptions in order to relate work and goals with erroneous behaviors of operators. However, they all exploit basic task description techniques, making it impossible to go beyond qualitative and quantitative temporal descriptions.

In this paper we propose the use of a detailed task description technique called HAMSTERS [18] within an HRA method to support the identification of errors related to information, knowledge and devices. Beyond that, we present extensions to the HAMSTERS notation in order to describe the identified errors within the task models. Integrating errors within a task model brings multiple advantages, the most prominent being the seamless representation of both the activities performed to reach goals and the possible deviations. Such an integrated representation can be exploited for building effective and error-avoidant interactive systems.

The paper is structured as follows. Section 2 presents the human error domain, human reliability assessment methods and task modeling. This state of the art is used to identify limitations of current HRA methods and to derive requirements for extending task models to encompass information dedicated to user errors. Section 3 presents an extended version of the HAMSTERS notation in which genotypes and phenotypes of errors enrich “standard” task models. This section also proposes a stepwise process based on the Human Error Template (HET) [33] HRA method to systematically identify user errors and represent them in task models. Section 4 shows, on a case study, how this framework can be used and what it brings to the design and verification of error-tolerant safety-critical interactive systems. Section 5 highlights benefits and limitations of the approach, while Sect. 6 concludes the paper and presents future work.

2 Related Work on Human Error and Task Modelling

Human error has received a lot of attention over the years, and this section presents the main concepts related to human errors as well as the existing approaches for analyzing them. This related work section starts with an analysis of taxonomies of human errors, followed by processes and methods for identifying human errors in socio-technical systems. The last sub-section summarizes work on representing human errors, with a specific focus on representations based on task descriptions.

2.1 Definition and Taxonomies of Human Errors

Several contributions in the human factors domain deal with studying the internal human processes that may lead to actions that can be perceived as erroneous from an external viewpoint. In the 1970s, Norman, Rasmussen and Reason proposed theoretical frameworks to analyze human error. Norman proposed a predictive model for errors [20], in which the concept of “slip” is highlighted and the causes of errors are rooted in the improper activation of patterns of action. Rasmussen proposed a model of human performance which distinguishes three levels: skills, rules and knowledge (the SRK model) [28]. This model provides support for reasoning about possible human errors and has been used to classify error types. Reason [30] builds on the contributions of Norman and Rasmussen and distinguishes three main categories of errors:

  1. Skill-based errors are related to the skill level of performance in SRK. These errors can be of one of the two following types: (a) slips, or routine errors, defined as a mismatch between an intention and an action [20]; (b) lapses, defined as memory failures that prevent the execution of an intended action.

  2. Rule-based mistakes are related to the rule level of performance in SRK and are defined as the application of an inappropriate rule or procedure.

  3. Knowledge-based mistakes are related to the knowledge level in SRK and are defined as an inappropriate usage of knowledge, or a lack of knowledge or corrupted knowledge, preventing the correct execution of a task.

At the same time, Reason proposed a model of human performance called GEMS (Generic Error Modelling System) [30], which is also based on the SRK model and dedicated to the representation of human error mechanisms. GEMS is a conceptual framework that embeds a detailed description of the potential causes for each error type above. These causes are related to various models of human performance. For example, a perceptual confusion error in GEMS is related to the perceptual processor of the Model Human Processor [5]. GEMS is very detailed in terms of description and vocabulary (e.g. strong habit intrusion, capture errors, overshooting a stop rule, etc.), and structuring approaches have been proposed, such as the Human Error Reference Table (HERT) [22].

Causes of errors and their observations are different concepts that should be separated when analyzing user errors. To this end, Hollnagel [9] proposed a terminology based on two main concepts: phenotype and genotype. The phenotype of an error is defined as the erroneous action that can be observed. The genotype of an error is defined as the characteristics of the operator that may contribute to the occurrence of an erroneous action.
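For later reference in this paper, these definitions can be summarized as a small data structure. The Python sketch below is only an illustrative encoding of the concepts above; the class and attribute names are our own and are not part of any cited framework.

```python
from dataclasses import dataclass
from enum import Enum, auto

class ErrorType(Enum):
    """Error categories following Reason's classification [30] (illustrative names)."""
    SLIP = auto()                     # skill-based: mismatch between intention and action
    LAPSE = auto()                    # skill-based: memory failure preventing an intended action
    RULE_BASED_MISTAKE = auto()       # application of an inappropriate rule or procedure
    KNOWLEDGE_BASED_MISTAKE = auto()  # inappropriate, lacking or corrupted knowledge

@dataclass
class Genotype:
    """Characteristic of the operator that may contribute to an erroneous action."""
    description: str      # e.g. "perceptual confusion"
    error_type: ErrorType

@dataclass
class Phenotype:
    """Erroneous action that can actually be observed."""
    description: str                  # e.g. "weather target wrongly or not detected"
    possible_causes: list[Genotype]   # genotypes that may explain this observation
```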

These concepts and the classifications above provide support for reasoning about human errors and have been widely used to develop approaches for designing and evaluating interactive systems [31]. As pointed out in [23], investigating the association between a phenotype and its potential genotypes is very difficult, but it is an important step in assessing the error-proneness of an interactive system. This is why most Human Reliability Assessment approaches pursue this double objective, as presented in the next section.

2.2 Techniques and Methods for Identifying Human Errors

Many techniques have been proposed for identifying which human errors may occur in a particular context and what their consequences could be in that context. Several human reliability assessment techniques, such as CREAM [10], HEART [39] and THERP [34], are based on task analysis. They provide support for assessing the possibility of occurrence of human errors by structuring the analysis around task descriptions. Beyond these commonalities, the THERP technique also provides support for assessing the probability of occurrence of human errors. Table 1 presents an overview of existing techniques for identifying potential human errors. For each technique, the following information is highlighted:

Table 1. Summary of techniques and methods used for identifying human errors

  • Type of technique: to indicate to which scientific domain the technique belongs. Values can be HEI (Human Error Identification), DC (Dependable Computing), SA (Safety Analysis), etc.

  • Associated task modelling technique: to indicate how the user tasks are described once the task analysis has been performed. Most of the techniques exploit the HTA (Hierarchical Task Analysis) notation [1];

  • Tool support for task analysis and modelling: to indicate whether or not a particular Computer-Aided Software Engineering (CASE) tool is available to support the application of the technique;

  • Associated error classification: to indicate which human error classification is used to identify possible errors. ‘G’ and ‘S’ indicate whether the classification comes from a generic analysis of system failures or is specific to human errors;

  • Capacity to deal with combinations of errors: to indicate whether or not the technique provides explicit support for identifying possible combinations of errors. Here only two values are possible: ‘No’ and ‘NE’ (Not Explicitly, meaning that the method does not explicitly claim that combinations of errors are handled).

For all the techniques presented above, the process of identifying possible human errors relies heavily on the descriptions of user tasks. The task descriptions have to be precise, complete and representative of the user activities in order to make it possible to identify all the possible errors. Indeed, the task description language, as well as the means used to produce the description, affect the quality of the analysis. However, most of these techniques exploit Hierarchical Task Analysis (HTA), which only provides support for decomposing user goals into tasks and sub-tasks and for describing the sequential relationships between these tasks (in a separate textual representation called a “plan”). As HTA does not provide support for precisely describing the types of user actions, the temporal orderings that differ from a simple sequence of actions (such as concurrent or order-independent actions), or the information and knowledge required to perform an action, errors related to these elements cannot be identified. Furthermore, as most of these techniques do not have tool support, it is cumbersome to check coverage of, and to store, identified errors in a systematic way. For example, as HTA does not provide support for describing the knowledge required to perform a task, none of these methods provides explicit support for the identification of all possible knowledge-based mistakes.

2.3 Support for Representation of Human Errors in Task Models

As explained above, the expressive power of a task modelling notation has a direct impact on how well the task models produced with that notation can support the identification of errors. Many task modelling notations have been proposed over the years, focusing on the representation of standard user behaviors and, most of the time, leaving aside erroneous behaviors.

Table 2 presents a comparison of task modelling notations to assess (depending on their expressive power) their capability to support the identification and representation of human errors. For each notation, the following information is highlighted:

Table 2. Support for describing errors and error-related elements

  • Identification of human error: to indicate whether or not the notation provides support to systematically establish a relationship between a task model element and a component of a model of human information processing or a model of human performance.

  • Explicit representation of human error: to indicate whether or not the notation provides support to systematically represent human error related information in a task model.

  • Explicit representation of error recovery: to indicate whether or not the notation provides support to explicitly represent recovery tasks, i.e. once an error has occurred, to describe the set of actions to be performed in order to still reach the goal. While this is possible in most task modelling notations (e.g. the set of actions to perform after entering a wrong PIN at a cash machine), we identify here whether or not the notation makes explicit that this set of tasks is related to a user error.

Even though the content of Table 2 demonstrates the very limited account of error handling in task modelling notations, task models have already been used to take into account possible human errors while interacting with an interactive system. Paterno and Santoro proposed a model-based technique that inserts deviated human actions into task models in order to evaluate the usability of the system and to inform design [25]; however, such information is presented in tables outside of the task models. This approach is relevant for human error identification but only in generic terms (as it exploits HAZOP, which is a standard hazard analysis method). Palanque and Basnyat proposed a technique based on task patterns (represented in CTT) that supports the description of human routine errors [22]. Here, a specific task model is produced in which recovery actions following errors are explicitly represented, thus ending up with two unconnected task models. Modifications in one of the task models then have to be reflected in the other one, increasing the complexity of task modelling activities. In both contributions, no specific notation elements are introduced, thus limiting the contributions to the basic task elements provided by the CTT notation (and therefore not covering errors related to information, knowledge, etc. as presented above).

In order to overcome the limitations of current task modelling notations, the next section presents extensions to the HAMSTERS notation to specifically represent errors. While the extensions are made explicit for that particular task modelling technique, the underlying concepts are generic, making them applicable to other notations.

3 Extending a Task Modelling Notation to Support the Identification and Representation of Human Errors

This section presents the extensions that have been added to the HAMSTERS notation in order to provide support for the systematic identification and representation of human errors in task models. We also present how this extended notation has been integrated within a human error identification technique. The resulting process starts from an extant task model and extends it with explicit genotypes and phenotypes of errors.

3.1 HAMSTERS Notation

HAMSTERS (Human-centered Assessment and Modeling to Support Task Engineering for Resilient Systems) is a tool-supported graphical task modeling notation for representing human activities in a hierarchical and structured way. At the highest abstraction level, goals can be decomposed into sub-goals, which can in turn be decomposed into activities. The output of this decomposition is a graphical tree of nodes that can be tasks or temporal operators. Tasks can be of several types (depicted in Table 3) and contain information such as a name, information details and a criticality level. Only the single-user high-level task types are presented here, but they can be further refined. For instance, cognitive tasks can be refined into analysis and decision tasks [19], and collaborative activities can be refined into several task types [16].

Table 3. Task types in HAMSTERS

Temporal operators (depicted in Table 4 and similar to the ones in CTT) are used to represent temporal relationships between sub-goals and between activities. Tasks can also be tagged with properties to indicate whether they are iterative, optional or both. The HAMSTERS notation is supported by a CASE tool for editing and simulating models. This tool was introduced in order to provide support for task-system integration at the tool level [16]. The tool-supported notation also provides support for structuring large and complex sets of tasks by introducing the mechanisms of subroutines [19], sub-models and components [8]. These structuring mechanisms allow large and complex activities to be described by means of task models, and enable the breakdown of a task model into several models that can be reused in the same or in different task models.

Table 4. Illustration of the operator types within HAMSTERS

The expressive power of HAMSTERS goes beyond that of most other task modeling notations, particularly by providing detailed means for describing the data that is required and manipulated [16] in order to accomplish tasks. Figure 1 summarizes the notation elements used to represent data. Information (“Inf:” followed by a text box) may be required for the execution of a system task, but it may also be required by the user to accomplish a task. Physical objects required for performing a task can also be represented (“Phy O”), as well as the device (input and/or output) with which the task is performed (“i/o D”). Situational and strategic knowledge can also be made explicit by means of the “SiK” and “StK” elements.

Fig. 1. Representation of objects, information and knowledge with the HAMSTERS notation
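HAMSTERS itself is a graphical notation supported by a dedicated CASE tool. Nevertheless, to make the concepts just described concrete, the Python sketch below encodes a simplified task model structure (task types, temporal operators, iterative/optional properties, and data elements such as information, physical objects, devices and knowledge). The type names and the subsets of task types and operators are our own assumptions for illustration and do not reflect the actual HAMSTERS tool or its API.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from enum import Enum, auto

class TaskType(Enum):
    # Illustrative subset of single-user high-level task types
    ABSTRACT = auto()
    PERCEPTIVE = auto()
    COGNITIVE = auto()
    MOTOR = auto()
    INTERACTIVE_INPUT = auto()
    INTERACTIVE_OUTPUT = auto()
    SYSTEM = auto()

class Operator(Enum):
    # Illustrative subset of temporal operators (similar to the ones in CTT)
    ENABLE = auto()             # sequential ordering
    CONCURRENT = auto()
    ORDER_INDEPENDENT = auto()
    CHOICE = auto()

@dataclass
class DataElement:
    kind: str   # e.g. "Inf", "Phy O", "i/o D", "DK", "SiK", "StK"
    label: str

@dataclass
class Task:
    name: str
    task_type: TaskType
    iterative: bool = False
    optional: bool = False
    operator: Operator | None = None                        # relates the sub-tasks below
    subtasks: list[Task] = field(default_factory=list)
    data: list[DataElement] = field(default_factory=list)   # required information/knowledge
```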

3.2 HAMSTERS Notation Elements and Relationship with Genotypes

All of the above notation elements are required in order to be able to systematically identify and represent human errors within task models. Indeed, some genotypes (i.e. causes of human errors) can only occur with a specific type of task or with a specific element in a task model described using HAMSTERS. The relationship between the classification of genotypes in human error models and task modelling elements is not trivial. For this reason, Table 5 presents the correspondences between HAMSTERS notation elements and error genotypes from the GEMS classification [29]. Such a correspondence is very useful for identifying potential genotypes on an extant task model.

Table 5. Correspondence between HAMSTERS elements and genotypes from GEMS [29]

It is important to note that the strategic and situational knowledge elements are not present in this table. Indeed, such constructs are similar to the M (Methods) in GOMS and thus correspond to different ways of reaching a goal. As all the methods allow users to reach the goal, an error cannot be made at that level, and these elements are thus not connected to a genotype.
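The role of such a correspondence table can also be sketched as a simple lookup from task model elements to candidate genotypes. The entries below are purely illustrative and do not reproduce the actual content of Table 5: they only reuse two associations mentioned elsewhere in this paper (perceptual confusion for a perceptive task, illusory correlation for a cognitive analysis task), expressed with the structures sketched in Sects. 2.1 and 3.1.

```python
# Hypothetical lookup table playing the role of Table 5 (entries are illustrative only).
CANDIDATE_GENOTYPES: dict[TaskType, list[Genotype]] = {
    TaskType.PERCEPTIVE: [
        Genotype("perceptual confusion", ErrorType.SLIP),
    ],
    TaskType.COGNITIVE: [
        Genotype("illusory correlation", ErrorType.KNOWLEDGE_BASED_MISTAKE),
    ],
    # ... one entry per HAMSTERS element covered by Table 5
}

def candidate_genotypes(task: Task) -> list[Genotype]:
    """Genotypes an analyst could consider for a given task model element."""
    return CANDIDATE_GENOTYPES.get(task.task_type, [])
```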

3.3 Extensions to HAMSTERS to Describe User Errors

Several notation elements have been added to HAMSTERS in order to allow explicit representation of both genotypes and phenotypes of errors. Table 6 summarizes these notation elements that can be used to describe an observable consequence of an error (phenotype) and its potential associated causes (genotypes).

Table 6. Representation of genotypes and phenotypes in HAMSTERS

In that table, the first column lists the types of errors following the GEMS classification. The second column makes the connection with the SRK classification, as previously done in [29]. The third column presents the new HAMSTERS notation elements for describing genotypes of errors, as well as how they relate to the classifications of human error. Four new elements are added: Slips, Lapses, Rule-Based Mistakes and Knowledge-Based Mistakes. As for phenotypes, only one notation element is proposed. Indeed, the phenotype (i.e. how the error is made visible) only needs to be explicitly represented: the label beneath it provides a textual description, while its relationship to the causes is expressed by connecting genotypes to it. Such connections are presented in detail in the case study section.
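In terms of the sketches given earlier, embedding these new elements in a task model amounts to attaching genotype and phenotype descriptions directly to tasks. The fragment below builds on the illustrative Task, Genotype and Phenotype structures from Sects. 2.1 and 3.1; it is an assumption made for illustration, not the actual HAMSTERS metamodel.

```python
from dataclasses import dataclass, field

@dataclass
class AnnotatedTask(Task):
    # Error-related elements embedded directly in the task model
    genotypes: list[Genotype] = field(default_factory=list)    # potential causes
    phenotypes: list[Phenotype] = field(default_factory=list)  # observable consequences
```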

3.4 Modelling Process

In this section, we show how we have integrated the extended HAMSTERS notation with the HET [33] technique. HAMSTERS could be used to replace HTA in any other human error identification method based on task descriptions, but we have chosen HET because it provides a detailed process and because it has been demonstrated in [33] to be more accurate than other techniques such as SHERPA and HAZOP [15].

Figure 2 presents a modified version of the HET process that provides support for identifying genotypes and phenotypes of possible human errors by embedding error descriptions in the task models that have been produced to describe user activities. The extended process starts with a task analysis and description phase (as in the original HET process), but in our case the produced task models are refined to represent perceptive, cognitive and motor user tasks, as well as the information and knowledge required to perform the tasks. These models take full advantage of the expressive power of HAMSTERS presented in Sect. 3.1. All the modifications made with respect to the original process are made explicit by using various shades of grey.

Fig. 2. Human error identification and description process extended from HET [33]

The next step in the process exploits the task type-genotype correspondence table (Table 5) to provide support for the systematic identification of the genotypes associated with perceptive, cognitive, motor and interactive input tasks, as well as of the related phenotypes. The likelihood and criticality of a genotype are recorded as properties of the instance of the represented genotype. This is performed in the HAMSTERS tool through specific properties associated with the genotype icons. Similarly, the likelihood and criticality of a phenotype can be described using properties of the instance of the represented phenotype. The likelihood of a phenotype may be a combination of the likelihoods of the related genotypes. Once all of the possible genotypes and phenotypes have been identified and described in the task model, the human error identification and representation technique is applied to the next task model. Once all of the models have been analyzed, a last step is performed (see the bottom-left activity in Fig. 2) in order to determine, for each task model that embeds human error descriptions, which phenotypes may be propagated to other task models. Several phenotypes may be associated with an observable task, but not all of them may happen in a particular scenario.
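The paper does not prescribe how the likelihoods of related genotypes are combined into the likelihood of a phenotype. As one possible reading, the sketch below assumes independent genotypes and a noisy-OR combination (the phenotype is observed if at least one of its genotypes occurs); both this combination rule and the function name are assumptions made for illustration.

```python
import math

def phenotype_likelihood(genotype_likelihoods: list[float]) -> float:
    """Noisy-OR combination under an independence assumption (illustrative only)."""
    return 1.0 - math.prod(1.0 - p for p in genotype_likelihoods)

# Example: a phenotype connected to two genotypes with likelihoods 0.05 and 0.02
# phenotype_likelihood([0.05, 0.02]) == 1 - 0.95 * 0.98 ≈ 0.069
```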

4 Illustrative Example from an Avionics Case Study

This section presents an excerpt of the task models produced by applying the identification and representation process presented above to a case study. The case study belongs to the aeronautics domain and, more precisely, deals with pilot tasks exploiting a weather radar cockpit application. This section aims at illustrating how the HAMSTERS extensions can be applied to human operations on a real-life application. Due to space constraints, not all of the new notation elements are shown in this article, but most of them are.

4.1 Presentation of the Weather Radar Case Study

The weather radar (WXR) is an application currently deployed in many cockpits of commercial aircraft. It provides support to pilots’ activities by increasing their awareness of meteorological phenomena during the flight, allowing them to determine whether they may have to request a trajectory change in order to avoid adverse weather conditions such as storms or heavy precipitation. In this case study, we particularly focus on the tasks that have to be performed by a pilot to check the weather conditions on the current flight path.

Figure 3 presents a screenshot of the weather radar control panels used to operate the weather radar application. These panels provide two functionalities to the crew. The first one is dedicated to the mode selection of the weather radar and provides information about the status of the radar, in order to ensure that the weather radar can be set up correctly. The operation of changing from one mode to another can be performed in the upper part of the panel (mode selection section).

Fig. 3. Image of (a) the numeric part of the weather radar control panel and (b) the physical manipulation of the range of the weather radar

The second functionality, available in the lower part of the window, is dedicated to the adjustment of the weather radar orientation (tilt angle). This can be done automatically or manually (Auto/Manual buttons). Additionally, a stabilization function aims to keep the radar beam stable even in case of turbulence. The right-hand part of Fig. 3 (labelled “(b)”) presents an image of the controls used to configure the radar display, particularly to set the range scale (right-hand knob with ranges of 20, 40, … nautical miles).

Figure 4 shows screenshots of weather radar displays according to two different range scales (40 NM for the left display and 80 NM for the right display). The spots in the middle of the images show the current position, importance and size of the clouds. Depending on the color of the clouds in the navigation display (Fig. 4), pilots can determine whether or not the content of the clouds is dangerous for the aircraft. For example, the red color highlights the fact that the clouds contain heavy precipitation. Such information is needed in order to ensure that the current or targeted flight plan is safe.

Fig. 4. Screenshots of weather radar displays (Color figure online)

4.2 Task Model of the Task “Check Weather Conditions on the Flight Path”

Figure 5 presents the description, using HAMSTERS notation elements, of the activities that have to be performed to check the weather conditions on the flight path.

Fig. 5. Task model of the “Check weather conditions on the flight path” task (Color figure online)

The tasks presented in this model describe how the pilot builds a mental model of the current weather from the information gathered on the navigation display (Fig. 4). For a pilot, checking the weather conditions is very important, as it provides support for deciding whether to maintain or change the current trajectory of the aircraft. This task is decomposed into three sub-tasks (a sketch after the list shows how this decomposition could be captured with the structures sketched in Sect. 3.1):

  • “Examine Map”: the pilot perceives and examines the radar image of the weather, which is displayed on the navigation display (see Fig. 4). To perform this analysis, the pilot has to know the meaning of the weather representations (described with declarative knowledge notation elements in Fig. 5, such as “Green light clouds mean precipitation”).

  • “Manage WXR control panel”: this sub-task is represented by a subroutine and linked to another task model, which describes the tasks that have to be performed to control the WXR modes.

  • “Manage Display Range”: this sub-task describes the actions that have to be performed by the pilot in order to change the range of the WXR display using the physical “range” knob (illustrated in Fig. 3b). The pilot has to turn the knob to modify the range and then wait for the radar image to be refreshed on the navigation display (Fig. 4).
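As announced above, the following fragment sketches how this decomposition could be captured with the illustrative structures from Sect. 3.1. The chosen task types, temporal operators and the sub-task breakdown of “Manage Display Range” are assumptions made for illustration; the actual model is the graphical one shown in Fig. 5.

```python
check_weather = Task(
    name="Check weather conditions on the flight path",
    task_type=TaskType.ABSTRACT,
    operator=Operator.ORDER_INDEPENDENT,   # assumption: not stated in the excerpt
    subtasks=[
        Task("Examine map", TaskType.ABSTRACT, operator=Operator.ENABLE,
             subtasks=[
                 Task("Perceive image", TaskType.PERCEPTIVE),
                 Task("Interpret and analyze", TaskType.COGNITIVE,
                      data=[DataElement("DK", "Green light clouds mean precipitation")]),
             ]),
        # Subroutine: detailed in a separate task model (control of the WXR modes)
        Task("Manage WXR control panel", TaskType.ABSTRACT),
        Task("Manage display range", TaskType.ABSTRACT, operator=Operator.ENABLE,
             subtasks=[
                 Task("Turn range knob", TaskType.MOTOR,
                      data=[DataElement("Phy O", "range knob")]),
                 Task("Radar image refreshed on navigation display", TaskType.SYSTEM),
             ]),
    ],
)
```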

4.3 Task Model with Human Errors

Figure 6 presents a modified version of the “Check weather conditions on the flight path” task model. This new version embeds the descriptions of the possible human errors (genotypes and phenotypes) that were identified while applying the human error identification process.

Fig. 6. Task model of the “Check weather conditions on the flight path” task embedding the description of potential errors

Each human task and interactive input task is connected to one (or several) genotype(s), indicating the possible cause(s) of errors. Genotypes are then connected to phenotypes, which are the observable consequences of the errors. For example, the “Perceive image” perception task is connected to the genotype “Perceptual confusion: image badly or not perceived” (zoomed-in view in Fig. 7). This genotype is in turn connected to the phenotype “Weather target wrongly or not detected”. In the same way, the “Interpret and analyze” cognitive analysis task, which requires particular knowledge to be performed (the “DK”-labeled rectangles containing declarative knowledge about the relationships between the color of visual artefacts in the navigation display and the composition of the clouds), is connected to the knowledge-based mistake “Illusory correlation: No weather problem detected”. This means that a wrong knowledge association made by the user could cause the non-detection of a weather issue on the flight path. This genotype is also connected to the phenotype “Weather target wrongly or not detected”.
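Using the illustrative structures from Sects. 2.1, 3.1 and 3.3, these two error descriptions could be encoded as follows. This is only a hypothetical textual rendering of the graphical model shown in Figs. 6 and 7; in particular, classifying perceptual confusion as a slip is our assumption.

```python
perceptual_confusion = Genotype(
    "Perceptual confusion: image badly or not perceived", ErrorType.SLIP)  # slip: assumption
illusory_correlation = Genotype(
    "Illusory correlation: no weather problem detected",
    ErrorType.KNOWLEDGE_BASED_MISTAKE)

# Both genotypes lead to the same observable consequence
weather_target_missed = Phenotype(
    "Weather target wrongly or not detected",
    possible_causes=[perceptual_confusion, illusory_correlation])

# Error descriptions attached to the tasks they relate to
perceive_image = AnnotatedTask(
    "Perceive image", TaskType.PERCEPTIVE,
    genotypes=[perceptual_confusion], phenotypes=[weather_target_missed])
interpret_and_analyze = AnnotatedTask(
    "Interpret and analyze", TaskType.COGNITIVE,
    genotypes=[illusory_correlation], phenotypes=[weather_target_missed])
```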

Fig. 7. “Examine map” sub-task of the “Check weather conditions on the flight path” task embedding the description of potential errors

5 Benefits and Limitations of the Approach

The stepwise refinement process for task models presented in Sect. 3.4 and its application to the case study in Sect. 4 have demonstrated the possibility of exploiting the extended version of HAMSTERS to support the identification and description of operator errors on an existing task model.

While this is critical in order to identify parts of a system that might be error prone or that are not tolerant to operator errors, it is also true that task models enriched with error artefacts gather a lot of information, which might decrease their understandability and modifiability. We currently favor the expressiveness of the notation and of the resulting task models over their legibility and understandability. These two aspects are being addressed at the tool level by providing multiple filtering mechanisms for temporarily hiding information that the analyst is not focusing on. For instance, all the information elements, as well as the genotypes and phenotypes, can be hidden if the current activity focuses on the sequencing of tasks.

The main objective of the approach is to support redesign activities when error-prone designs have been identified. Such a redesign would take place through an iterative design process involving the co-evolution of tasks and systems as presented in [3], but development costs are clearly increased. This is the reason why such an approach would also be useful for supporting certification activities for critical systems. For instance, as stated in [6] CS 25.1302 annex E 1-F-1, “Flight deck controls must be installed to allow accomplishment of these tasks and information necessary to accomplish these tasks must be provided”, and in CS 25.1309, “systems and controls, including indications and annunciations, must be designed to minimize crew errors, which could create additional hazards”. The CS 25 document consists of a list of requirements that have to be fulfilled for aircraft manufacturers to successfully go through certification processes (which are managed by regulatory authorities and/or third parties). The two highlighted requirements demonstrate that certification can only be successful with a complete and unambiguous description of operators’ tasks and by ensuring that the equipment (called the system in this paper) is not error prone.

Finally, it is important to note that the proposed process and its associated tool-supported notation remain a manual, expert-based activity. This is made clearly visible by the “is the error credible?” step in the process, where the identification of errors can only come from a deep understanding of operators’ activities and their possible deviations.

6 Conclusion

In this paper we have presented a way of systematically taking into account abnormal user behavior by extending previous work in the areas of task modelling and human error analysis and identification.

We proposed the use of several human error classifications and integrated them into an analysis and modelling process exploiting new extensions to the task modelling notation HAMSTERS. These extensions make it possible to explicitly represent genotypes and phenotypes of operator errors and to describe their relationships.

These contributions have been applied to a real-life case study in the field of aeronautics, demonstrating most of the aspects of the contributions. However, errors related to strategic knowledge and errors related to temporal ordering (e.g. the task model describes a sequence of tasks but the operator performs them in parallel) were not presented, even though they are covered by the approach.

As identified in the “Benefits and Limitations” section, this work targets the support of certification activities for critical systems, and more precisely for the cockpits of large aircraft. However, thanks to the tool support provided by HAMSTERS (which makes human error identification and description less resource-consuming), the approach is also applicable to other domains where errors are damaging in terms of human life, economics, prestige, trust, etc.