1 Introduction

The work described in this paper is a result of the efforts of the Unified Multi-modal Measurement for Performance Indication Research, Evaluation, and Effectiveness (UMMPIREE) project. The UMMPIREE project is sponsored by the U.S. Army Research, Development, and Engineering Command (RDECOM), Army Research Laboratory (ARL), Human Research and Engineering Directorate (HRED), Advanced Training & Simulation Division (ATSD), Advanced Modeling & Simulation Branch (AMSB). The UMMPIREE research team has a diverse background including experience in cognitive assessment, software engineering, and modeling and simulation.

2 Research Background and Motivation

ARL conducts both pure and applied research of importance to the U.S. Army. As such, ARL has an interest not only in pure science [1, 2], but in how that science can be uniformly applied in real-world applications [3]. The U.S. Army is looking at ways to better understand, characterize, and improve human performance both individually and in teams [4]. Given this research focus and the resulting real-world applications, it is prudent for the Army to also develop ways and means in which to more accurately and consistently assess human performance in all relevant contexts. This is important to assure that the research results are founded in a shared and thorough scientific approach and those results can be transferred validly and effectively to the engineering domain where they can support the warfighters’ needs.

In addition to the challenges of characterizing and improving human performance, the availability of technology relevant to human performance, especially augmentation capabilities, is rapidly increasing. This can be seen in all sectors of society and is already so ubiquitous as to not be noticed. Some simple examples of human performance augmentation include smart phones, internet search engines, and prosthetic limbs and organs. It is expected that this trend will continue and furthermore will be characterized by an increasing presence of machine intelligence and autonomy of the augmentations [5]. This can be seen in the field of robotics and autonomous vehicles [6] to cite an obvious example. It is expected that the increasing demands on human performance [7], coupled with more robust machine intelligence, will strain the ability of prevailing methods and concepts in assessment to keep pace. Assessment is important for at least the following two reasons: (1) the ability to accurately portray performance in engineering and development efforts so the best acquisition decisions can be made; and (2) the ability to adequately train the soldier to perform using the augmentation capabilities in order to meet the objectives of a given mission.

Some of the key concepts that underpin research in human performance and teaming (and the assessment thereof) are well known and have been part of the literature in psychology and related fields for many years [8]. It is the observation of this research team, however, that the discipline of assessment will benefit from an increased level of rigor if the state of the practice is to advance far enough to meet the challenges just discussed. It is also our observation that one of these key concepts is the psychological “construct,” together with closely associated concepts such as operationalization and construct validity. This field of psychological and social research is too vast and the state of human knowledge too meagre to universalize these concepts for all communities, but we believe that it is possible, even essential, to make progress in standardization across specific communities of interest (COI), especially within the U.S. Army. The UMMPIREE project is addressing how it can support the maturation of these ideas in the context of the Human Machine Teaming (HMT) domain in particular. Part of this effort is the proposed Conceptual Assessment Model (CAM) discussed in this paper.

3 Key Terminology Used in Our Research with Discussion

Before describing the CAM, we present a short survey of selected key terms with definitions that are of use within our research effort. We use commonly available sources from the internet for definitions.

3.1 Key Terminology

Construct.

“Construct, also called hypothetical construct or psychological construct, in psychology, a tool used to facilitate understanding of human behaviour [sic]. All sciences are built on systems of constructs and their interrelations. The natural sciences use constructs such as gravity, temperature, phylogenetic dominance, tectonic pressure, and global warming. Likewise, the behavioral sciences use constructs such as conscientiousness, intelligence, political power, self-esteem, and group culture. […] In a sense, a psychological construct is a label for a cluster or domain of covarying behaviours [sic]. For example, if a student sees another sitting in a classroom before an examination biting her nails, fidgeting, lightly perspiring, and looking somewhat alarmed, the interpretation might be that she is experiencing test anxiety” [9].

Operationalization.

“Operationalization is […] the process of defining a fuzzy concept so as to make it clearly distinguishable, measurable, and understandable in terms of empirical observations” [10].

Construct Validity.

“Construct validity refers to the degree to which inferences can legitimately be made from the operationalizations in your study to the theoretical constructs on which those operationalizations were based” [11]. A further clarification: “Construct validity refers to the extent to which a test, device, or instrument measures what it purports to measure. This impacts the degree to which inferences can be legitimately made from the operationalizations in a study to the theoretical constructs on which those operationalizations were based.” The reader is also referred to the seminal paper on this subject by Cronbach and Meehl [12].

Unified Modeling Language (UML).

“The Unified Modeling Language (UML) is a general-purpose, developmental, modeling language in the field of software engineering, that is intended to provide a standard way to visualize the design of a system. […] In 1997 UML was adopted as a standard by the Object Management Group (OMG), and has been managed by this organization ever since. In 2005 UML was also published by the International Organization for Standardization (ISO) as an approved ISO standard” [13].

Conceptual Model (Per Wikipedia).

“A conceptual model is a representation of a system, made of the composition of concepts which are used to help people know, understand, or simulate a subject the model represents. Some models are physical objects; for example, a toy model which may be assembled, and may be made to work like the object it represents” [14].

Conceptual Model (Authors’ Addition to the Wikipedia Definition).

A conceptual model makes explicit and unambiguous the specific concepts which are being examined or represented – and therefore the observables from which data will be collected. It also serves to codify the researchers’ presuppositions about the problem space. Anything not explicitly represented in the conceptual model cannot be fully and correctly reasoned about or analyzed since it will lack objective evidence.

Measurement.

“Measurement is the assignment of scores to individuals so that the scores represent some characteristic of the individuals” [15]. Measurement can also be defined as the assignment of a number or score on a scale to a characteristic of an individual, object, or event to enable comparison to other individuals, objects, and events.

Assessment (Educational).

“A tool or method of obtaining information from tests or other sources about the achievement or abilities of individuals. Often used interchangeably with test” [16].

Assessment (Psychological).

“Psychological assessment is a process of testing that uses a combination of techniques to help arrive at some hypotheses about a person and their behavior, personality and capabilities.”

Operationalized Construct (Authors’ Definition).

A construct that has been defined, at least partially, in terms of a finite and discrete set of measurable quantities.

3.2 Discussion

The focus of this paper is on developing a tool, the CAM, that can increase clarity in research involving constructs. It is noted, however, that terms closely associated with constructs, especially measurement and assessment, seem to have multiple meanings and are sometimes conflated with one another. For purposes of this research we refer to assessment as the broad, overall process that may involve quantitative data from measurements, qualitative data from observations or other sources, and expert (or not so expert) judgment. It is also observed that the usage of terms like assessment varies by context (e.g., psychology or education).

Figure 1 depicts the purpose of a conceptual model in the UMMPIREE project. A conceptual model articulates the finite observables that are used in an assessment thereby both limiting the scope of the assessment and enabling a clear understanding of all the factors in the assessment.

Fig. 1. Purpose of the conceptual model in UMMPIREE

The real world has many features and attributes that may be of interest to a given assessment, so many that the number may approach infinity. The purpose of the conceptual model is to identify and make discrete a finite set of those features and attributes in a way that allows that finite set to be measured. The resulting quantitative data then forms a significant portion of the overall assessment. We recognize that quantitative data alone is not necessarily sufficient for a good assessment and that qualitative and subjective data form key contributions to assessment as well.

4 Conceptual Assessment Model (CAM)

In this section we discuss UML diagrams of the CAM. We use UML merely as a convenient, commonly used, and conventionally accepted way of articulating a model structure. We use only two concepts from UML: classes and compositions. Classes are represented as boxes. Compositions, in which one class is made up of other classes, are indicated by the diamond shape.

Figure 2 illustrates the CAM using a UML representation. The CAM is composed of one Subject Model and one to many Operationalized Constructs, and is associated with one Mission or Assessment Context. This Mission or Assessment Context may also influence the Operationalized Constructs that are part of the CAM.

Fig. 2. The Conceptual Assessment Model (CAM)

The assumption is that “what is being assessed” is the Operationalized Construct of which there is at least one, but could be several. The purpose of the CAM is not to prescribe any particular method of executing assessment (or experiment), but to increase the level of uniformity across similar assessments by framing the assessment in a common, yet flexible, structure.
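
The composition just described can be sketched in code. The following is a minimal illustration in Python, not part of the UMMPIREE work itself: the class and field names are our own assumptions, and the example values are borrowed from the hypothetical use case in Section 5. It only shows how the "one Subject Model, one to many Operationalized Constructs, one context" structure might be enforced.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubjectModel:
    name: str


@dataclass
class MissionOrAssessmentContext:
    name: str


@dataclass
class OperationalizedConstruct:
    name: str


@dataclass
class ConceptualAssessmentModel:
    """Top-level CAM: one Subject Model, one context, 1..* Operationalized Constructs."""
    name: str
    subject_model: SubjectModel
    context: MissionOrAssessmentContext
    operationalized_constructs: List[OperationalizedConstruct] = field(default_factory=list)

    def __post_init__(self) -> None:
        # "What is being assessed" must exist: at least one Operationalized Construct.
        if not self.operationalized_constructs:
            raise ValueError("A CAM needs at least one Operationalized Construct")


# Example values taken from the fictional use case in Section 5.
cam = ConceptualAssessmentModel(
    name="Trust in Soldier-Robot Teams",
    subject_model=SubjectModel("Soldier-Robot"),
    context=MissionOrAssessmentContext("Transport heavy load/field environment"),
    operationalized_constructs=[
        OperationalizedConstruct("Trust – Will Follow"),
        OperationalizedConstruct("Transparency – Soldier Knows State of Robot"),
    ],
)
```

Attempting to build a CAM with an empty list of Operationalized Constructs raises an error, which mirrors the "at least one" assumption stated above.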

Figure 3 illustrates the Subject Model class of the CAM. In this example, the Subject Model is specific to the HMT problem space.

Fig. 3. The CAM subject model

The Subject Model for the HMT problem space is composed of one to many Human Models, one to many Machine Models, and one Team Model. In addition, there are one to many Human-Machine Interaction Modes.

Figure 4 illustrates the Mission or Assessment Context that is an essential element of the CAM and may influence the Operationalized Constructs that compose the CAM.

Fig. 4. The CAM mission/assessment context

This Mission or Assessment Context model is also specific to the HMT problem domain. It is composed of one to many Human Tasks, one to many Machine Tasks, one to many Team Tasks, and a unique (one) Mission (or Assessment) environment.
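
The multiplicities stated for the Subject Model and for the Mission or Assessment Context can also be written down as a small rule table and checked mechanically. The sketch below is one possible way to do this in Python. The rule tables restate the text above, while the function name and the dictionary layout are our own assumptions; the example parts come from the soldier and robotic mule use case in Section 5.

```python
from typing import Dict, List, Optional, Tuple

# (minimum, maximum) count per part; None means "many" (no upper bound).
Multiplicity = Tuple[int, Optional[int]]

SUBJECT_MODEL_RULES: Dict[str, Multiplicity] = {
    "Human Model": (1, None),
    "Machine Model": (1, None),
    "Team Model": (1, 1),
    "Human-Machine Interaction Mode": (1, None),
}

CONTEXT_RULES: Dict[str, Multiplicity] = {
    "Human Task": (1, None),
    "Machine Task": (1, None),
    "Team Task": (1, None),
    "Mission Environment": (1, 1),
}


def multiplicity_violations(
    parts: Dict[str, List[str]], rules: Dict[str, Multiplicity]
) -> List[str]:
    """Return one message for every part whose count falls outside its rule."""
    problems = []
    for part, (low, high) in rules.items():
        count = len(parts.get(part, []))
        if count < low or (high is not None and count > high):
            bound = f"{low}..{'*' if high is None else high}"
            problems.append(f"{part}: found {count}, expected {bound}")
    return problems


# Illustrative Subject Model for a single soldier and a robotic mule.
subject = {
    "Human Model": ["Soldier"],
    "Machine Model": ["Robotic mule"],
    "Team Model": ["Soldier-Robot"],
    "Human-Machine Interaction Mode": ["Wireless Controller"],
}
print(multiplicity_violations(subject, SUBJECT_MODEL_RULES))  # prints []
```

Keeping the rules as data rather than code means the same check can be reused for other problem domains by swapping in a different rule table.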

We define the Operationalized Construct class as shown in Fig. 5.

Fig. 5. The CAM operationalized construct

The Operationalized Construct is composed of at least one ARL Standard Construct or Special Construct, but there could be multiples of each of these standard and special models. The Operationalized Construct is influenced by the Mission or Assessment Context model.

We define the Construct model itself as shown in Fig. 6. Not surprisingly, the Construct model can be complex. It can be composed of multiple theories, although none are required. The only requirement is that an Evidence model is defined.

Fig. 6. The CAM construct model
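
The two requirements stated above (an Evidence model is mandatory while theories are optional, and an Operationalized Construct needs at least one Standard or Special Construct) can be illustrated as follows. This is a sketch under our own naming assumptions, not a definitive implementation. In particular, treating the "ARL Basic Two-Party Trust Construct" from Section 5 as a Standard Construct is an assumption made only for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EvidenceModel:
    description: str


@dataclass
class Construct:
    name: str
    evidence: EvidenceModel                             # required: no default value
    theories: List[str] = field(default_factory=list)   # optional: zero or more


@dataclass
class OperationalizedConstruct:
    name: str
    standard_constructs: List[Construct] = field(default_factory=list)
    special_constructs: List[Construct] = field(default_factory=list)

    def __post_init__(self) -> None:
        # At least one Standard or Special Construct must be present.
        if not (self.standard_constructs or self.special_constructs):
            raise ValueError("Need at least one Standard or Special Construct")


# Assumption: the two-party trust construct is filed under "standard".
two_party_trust = Construct(
    name="ARL Basic Two-Party Trust Construct",
    evidence=EvidenceModel("Reliance Agreement between the two parties"),
)
trust = OperationalizedConstruct(
    name="Trust – Will Follow",
    standard_constructs=[two_party_trust],
)
```

Because `evidence` has no default, a Construct cannot be created without an Evidence model, whereas `theories` may be left empty.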

In summary, the assessment process can benefit from the use of the CAM through the following steps:

  1. Identify and detail (provide specificity) the components of the CAM that will be used for a particular assessment.

  2. Develop a data collection and measurement plan for each element of the CAM that is identified as useful for the assessment.

  3. Articulate how the components and elements relate to one another (e.g., how do the tasks relate to the constructs? What data elements will be used for calculating what assessment measures?) from an analysis perspective.

  4. Articulate how these data will be analyzed using Measures of Performance (MOPs) and Measures of Effectiveness (MOEs) and other high level measures.
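
The steps above lend themselves to a simple coverage check: does every selected CAM element have a collection plan, and does every measure have the data it needs? The Python sketch below shows one hypothetical way to ask those questions. All element, data, and measure names are illustrative assumptions (only "Wireless Controller" and "% time connected" appear in the example of Section 5), and the functions are not part of any existing tool.

```python
from typing import Dict, List

# Step 1: CAM elements selected for this assessment (illustrative names).
selected_elements: List[str] = ["Wireless Controller", "Robot follows soldier"]

# Step 2: data elements planned for collection, per selected CAM element.
collection_plan: Dict[str, List[str]] = {
    "Wireless Controller": ["link_status_samples"],
}

# Steps 3 and 4: data elements each assessment measure depends on.
measure_inputs: Dict[str, List[str]] = {
    "% time connected": ["link_status_samples"],
    "% time within following distance": ["separation_samples"],
}


def elements_without_plan(elements: List[str], plan: Dict[str, List[str]]) -> List[str]:
    """Selected CAM elements that have no data collection planned."""
    return [element for element in elements if not plan.get(element)]


def measures_missing_data(
    inputs: Dict[str, List[str]], plan: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Measures that depend on data elements nobody plans to collect."""
    collected = {item for items in plan.values() for item in items}
    gaps = {}
    for measure, needed in inputs.items():
        missing = [item for item in needed if item not in collected]
        if missing:
            gaps[measure] = missing
    return gaps


print(elements_without_plan(selected_elements, collection_plan))
print(measures_missing_data(measure_inputs, collection_plan))
```

In this illustration the second element has no collection plan, and the second measure depends on data that is not collected, so both gaps would be reported before the assessment is run.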

5 Applying the CAM to a “Trust” Construct

The construct of “trust” and related constructs such as “transparency” occur many times in the literature; the references found in this paper cite only a few [17,18,19,20,21,22]. These constructs are widely used, yet in some papers no definition is offered (see Footnote 1). It is possible that the reader is expected to have a shared, cultural definition in mind, or that the construct itself is too difficult to define, or that only concepts or measures somehow ancillary to the construct itself can be articulated or operationalized. We presume, however, that in some cases, it will benefit the research to operationalize these well-used constructs as much as possible, even if it means greatly simplifying the situation by putting aside many potential, but difficult to articulate or measure, possibilities.

To explore how the CAM might help develop an explicit definition of trust in a specific context, we imagine a simple, fictional assessment use case. The situation is that we wish to assess “trust” in the context of a single soldier and a robotic mule device that is designed to follow the soldier while carrying a given load.

If this were an actual assessment (or experiment), we would want to determine how we were going to conduct the assessment, what data we would need, and what measures or analysis would need to be observed and calculated. In this hypothetical example, we simply identify some obvious, and presumably “easy,” measures – those measures are highlighted in the tables below.

The tables below represent instantiations of the UML classes described above. Table 1 includes the particular CAM Name (Trust in Soldier-Robot Teams). The Subject Model is the Soldier-Robot. The Mission-Assessment Context Name is “Transport heavy load/field environment.” We identify two operationalized constructs: “Trust – Will Follow” and “Transparency – Soldier Knows State of Robot.”

Table 1. Example trust CAM

In Table 2 our hypothetical example is further developed by describing the “Trust in Soldier-Robot Teams” Subject Model. For this table and subsequent tables, several columns are added. These can be thought of as “attributes” of the model. If there are measurable quantities associated with a particular class, those are identified along with suitable measures. For example, the Human-Machine Interaction Mode – 1 is a Wireless Controller. The Variable is Connectivity and is measured by % time connected. The final column includes other constraints or characterizations that should be associated with a given class.

Table 2. Example subject model
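
To make the Connectivity example concrete, the measure "% time connected" could be computed from periodic link-status samples. The sketch below assumes equally spaced boolean samples, which is our own simplification; the paper does not specify how connectivity would be recorded, and the function name is hypothetical.

```python
from typing import Sequence


def percent_time_connected(link_up: Sequence[bool]) -> float:
    """Percentage of equally spaced samples in which the controller link was up."""
    if not link_up:
        raise ValueError("At least one sample is required")
    # True counts as 1 and False as 0, so sum() gives the number of connected samples.
    return 100.0 * sum(link_up) / len(link_up)


# Four samples, three of them connected.
samples = [True, True, False, True]
print(percent_time_connected(samples))  # prints 75.0
```

If samples were not equally spaced, each one would need to be weighted by the duration it represents.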

Table 3 describes the example Mission-Assessment Environment. It comprises two human tasks, two machine tasks, and one team task. The mission-assessment environment is an open field, in this case a parking lot. (LOS = Line of Sight)

Table 3. Example mission-assessment environment

Table 4 describes the top level of the hypothetical Operationalized Trust Construct.

Table 4. Example operationalized trust construct (top level)

Table 5 describes the hypothetical ARL Basic Two-Party Trust Construct. In this example, the single feature of the Evidence Model is a Reliance Agreement between the two parties. In this case, the reliance agreement is simply a functioning communications device.

Table 5. Example ARL basic two-party trust construct

The Transparency construct can be similarly described, but we do not do so in this paper.

6 Extensions to a Network Approach

Starting at least as far back as Cronbach and Meehl [12], the relationship between constructs and networks or graphs has been recognized. Recently, multiple researchers [23,24,25] have introduced concepts from network theory [26] to this problem space as well, greatly enriching the potential for further research. The authors see the potential for data mining of current research, especially if a structure similar to the CAM is used to “normalize” different research approaches and methods. The CAM could then be used in conjunction with network approaches to further discover commonality (and variability) across research and thereby support systemization of constructs and the assessments in which they are used.

7 Further Research

The CAM is a concept that is intended to explore how activities like human or human-machine team assessments may be improved through a more systematic and standardized approach to defining constructs within a given research or assessment context. Using the UML formalism to define a conceptual model leads to many questions about how constructs are defined and about the relationships between concepts within such a model. Using a UML class approach is only a beginning at describing some of the static relationships between concepts. UML (or other modeling approaches, for that matter) also provides ways to further delineate not only static aspects but also dynamic aspects. This could be particularly relevant for a construct such as trust, since trust can be expected to vary over time.

The most important future research is an attempt to use the CAM in a real assessment or experimental setting. The “real world” or “in the wild” settings can be expected to introduce many challenges that could easily overwhelm a CAM implementation that is too literal. This in itself is a challenge to any research intended to further systematize the field of human performance assessment, especially in complex, cognition-intensive, and machine intelligence augmented situations. It is the authors’ belief that to continue to make progress in this increasingly complex operational environment, progress must be made in systemization and standardization.

Finally, the authors intend to further explore the potential connections between the CAM and network approaches to constructs, measures, and assessments.

8 Summary and Conclusions

We have presented a Conceptual Assessment Model (CAM) using the UML methodology. The CAM provides a potential tool that could be used in a structured method of assessment. The CAM, or other similar concepts, could be particularly useful across an enterprise in serving to standardize the way constructs are defined and what measures are used to describe them.