1 Introduction

One common notion in combat performance is known as Boyd’s OODA (observe, orient, decide, act) loop, and is drawn from military strategies to present real-time decision-making processes. An elaborative model was later offered by Boyd to comply with more complex forms of combat and feedback loops available in the decision-making process (see Brehmer 2005). In this elaboration, the ‘Observe’ stage consists of multiple information sources, both internal, derived from the unfolding of the circumstances and the immediate environment, and external, outside information provided by others. Observation is often influenced by implicit guidance of higher echelons or sources who may have broader perspective of the mission. In the ‘Orient’ stage, synthesis of information is accomplished; from representing the physical location of various elements in the immediate environment to a mental model of the situation. While not using the same terms, this process is somewhat similar to what is being described as generating the mental simulation in Klein’s RPD model (Klein and Crandall 1995).

1.1 Use of Visualization Concepts to Facilitate “Observe and Orient”/Sensemaking

Integrated visualization concepts have been shown to aid sensemakers (Ntuen et al. 2010) in complex information environments. Baber et al. (2010) proposed a framework representing two parallel cycles; a short Close Target Reconnaissance (CTR) OODA loop cycle and a broader cycle of recording, communicating and interpreting information that feeds into the short cycle and improves sensemaking, situation awareness, and situation understanding in context (e.g., imagery provided by cameras to supplement technologies such as night vision goggles or binoculars). Hence, the input fed into the short CTR loop should be relevant, accurate and timely and the notion that information is “a means to an end, not an end in itself” must be continuously stated. Oron-Gilad and Parmet (2016) have used the OODA loop framework to analyze the decision cycle of dismounted soldiers in a patrol mission who received video feed from an unmanned ground vehicle that was ~20–50 m ahead. They found that the addition of video feed data to the moving dismounted soldier had several detrimental effects on soldiers’ orientation and response to events in their immediate environment, especially ones that were not seen by the technology. Their field evaluation highlights the costs that adding graphical information may have on the observe and orient components of the OODA loop, thus raising the need for new roles in combat-team setups and for additional training when unmanned vehicle sensor imagery is introduced to end-operators. Indeed, judgment and decision processes require shared knowledge as well as efficient communication. Technology enables teams to perform tasks together while they are in different locations and using various communication means. Yet, it also increases the need for common assessments or common mental models of situations during the decision-making process (Mosier and Fischer 2010). 
The concept of common ground (CG) refers to the mutual knowledge, beliefs, and assumptions that underpin social and collaborative activities. CG is required for the comprehension of normal conversational interactions and is essential for the coordination of joint actions (Cumming and Akar 2005). Different media provide different resources or affordances that shape communication (Kraut et al. 2002).

1.2 Use of Video to Facilitate Co-presence of Non-military Teams

Fussell et al. (2003) found that pairs work best when they are located side by side and share full visual co-presence. In contrast, pairs worked least well when they had only audio communication and could not see the work area. Some of the benefits of shared visual space can be provided through technology: a scene-oriented camera showing a wide-angle view of the workspace provided significant benefit over audio-only communications. However, a head-mounted camera with eye-tracking capabilities provided little benefit. Moreover, the combination of head-mounted camera and scene camera did not enhance pairs’ effectiveness over the scene camera alone, and in fact led to longer performance times than the scene camera alone. Using two cameras made it difficult to decide how to distribute attention between them and caused confusion about which one was in use. Such findings caution against strategies to create shared visual space through multiple video feeds. The authors concluded that providing a wide-angle static view was the most valuable form for remote collaboration on physical tasks. In another non-military study, the focus was on the use of video as a communication means for capturing the images of remote participants as they perform tasks together. In this type of remote-collaborative work domain, video feed that was used to communicate visual aspects of interaction, such as eye gaze, physical gestures, and facial expressions, had no effect on either the quality of the interaction or the outcome of the task. In situations where visual communication consisted of important content that was needed to improve coordination and collaboration (e.g., a neurosurgical procedure in an operating room), the results were more promising; hence, using video as data, rather than as a concept of shared events or as an indication of who is present in an activity, was beneficial.
These types of contextual video images provided means to maintain the team members’ (including those who did not have an active role) attention to the operation at any given time and facilitated coordination of fast-paced activities between members of the team (Obradovich and Smith 2008).

Hew (2011) proposed a structured, graphical Data-Tracks-Actions (DTA) representation approach to analyzing C2 teams and their technologies. The graphic representation depicts where and how situation awareness is being formed, how it translates into battlespace actions, and the command roles that form and adapt the end-to-end workflows. In a somewhat similar way to the OODA loop representation for an individual, in the context of perception, sensemaking and action of a team, the graphic representation proposed by Hew can serve to detect the roles and intra-team relationships that arise from introducing new technological capabilities. Most importantly to our study, Hew distinguishes between the various communication systems and technologies, their tempo and what they enable. Thus, voice messages can be directed or shared at a tempo of 5–10 and 20–30 s per position. Other communication technologies (e.g., video or imagery) vary in purpose and tempo. To exemplify, Hew (2011) presented a field artillery case study which demonstrated how this analytic approach helps in dealing with issues and opportunities in C2 design.

There are limitations to video systems in providing the shared visual space, and those must also be considered (e.g., which visual cues are enabled, what the field of view of each participant is, and what level of detail is needed about the work area). When the remote viewers of the video are physically present, they also benefit from contextual cues in addition to the data from the video (hence creating information). This context could be lost if the team members are not co-located and are provided with the video feed but with no other means for maintaining common situation awareness about the broader context. To sum up, the practical implication of the Obradovich and Smith (2008) study was: “Use video when an accurate and informative picture is needed”. When tasks are divided among team members, such that some have control and others hold the data or knowledge, coordination may be required to manage the interdependencies among activities and the interactions between members (Obradovich and Smith 2008).

1.3 Use of Visualization Concepts to Improve Dynamic Decision Making in C3Fire Tasks

Artman (1998) could not answer the question of whether a graphical or a textual database would improve dynamic decision making in C3Fire (Communication, Command and Control) tasks. However, he concluded that the team’s situation awareness is composed of coordination and trust in others, and of an understanding of how the team’s actions (e.g., interactions or consequences) affect the development of the dynamic system. Granlund et al. (2011) further used the C3Conflict simulation environment, which is based on the C3Fire microworld (“simulated environments that realistically capture important characteristics of a real system including the complex, dynamic and opaque characteristics of decision making problems”) but was tailored to the military domain by presenting situations with a military cover story and appearance. This environment is a two-sided game in which one of the sides can be hostile. C3Conflict adds analysis capabilities such as the effectiveness of the teams, the information distribution in the team, and the team’s work and collaboration methods.

Large-scale digital command and control systems often suffer from suboptimal performance due to interface problems. Walker et al. (2010) found that with the relatively simplistic voice communication, more data was transformed into useful information than with the highly complex digital communication. People prefer a simple interaction that enables them to do complex tasks quickly, rather than a complex interaction that only allows them to do simple things with considerable effort. The digital communication layer carried a significantly higher proportion of data compared to the voice layer, which in turn carried a greater proportion of information. Furthermore, what the Brigade headquarters received was predominantly “data” (96%), whereas a greater proportion of “information” (26%) left it. Walker et al. (2010) suggested reconsidering the way that digitization should be perceived, designed, and operationally prepared: a human-centric view rather than a techno-centric view (in which capability is viewed in terms of technological advancement).

The goal of the current experiment was to examine whether adding simple one-directional graphic communication capabilities to an existing verbal communication channel between observer and attacker team members would improve mission performance (as defined by objective performance measures) and reduce the stress level of the participants (as measured by the DSSQ). The observers were provided with the technological capability to send still images to the attackers, in one of two forms (as taken, or with annotations). Delivery of still images was compared to two other alternatives: sending target coordinates (baseline) and pointing at the target directly on the attacker’s imagery (augmentation).

It was hypothesized that: (1) the availability of still images will improve the communication between the observer and the attacker; (2) adding the ability to graphically mark elements on the still image (annotations) will improve the communication and shorten the mission cycle, even more than still images alone; (3) the verbal communication pattern among team members will change due to the presence of the graphic communication; and (4) when participants have a free choice of communication means, they will prefer the still images with graphic markings over the other three alternatives (coordinates, augmentation on reality and still images without graphic markings).

2 Methodology

2.1 Participants and Allocation to Teams

Eight teams of three teammates took part: an ‘attacker’ (dismounted/mounted soldier), an ‘observer’ (operator of a remote device, aerial or ground), and an experimental ‘controller’ (higher echelon commander). Each of the attacker and observer participants took part in one experimental day. The same controllers participated as experimental confederates in all three experimental days. There were four teams with Tanks as attackers: two with Tablet observers, one with a Coral observer, and one with a UAS observer. There were four teams with Spike-MR attackers: two with Coral observers, one with a Tablet observer, and one with a UAS observer. Participants were all soldiers or military reserve soldiers who had been on active duty in the year prior to the experiment. Each participant was assigned to a simulated work station according to his/her expertise; e.g., a UAS operator was assigned to operate the UAS.

2.2 Instruments and Apparatus

A set of questionnaires was administered at the end of the experiment. It consisted of: (1) a personal characteristics and military experience questionnaire, (2) a 15-item usability questionnaire based on the SUS – System Usability Scale (Brooke 1996) using a five-point scale from 1 “Strongly disagree” to 5 “Strongly Agree”, (3) a 20-item Dundee Stress State Questionnaire (DSSQ; Matthews et al. 1999, 2002) using a scale from “0” (Definitely false) to “4” (Definitely true), which was administered after each round, and (4) a Quality of communication questionnaire, which had two different versions, one for the soldiers and the other for the operators (based on Fiore et al. 2003).

2.3 Procedure

Participants arrived at the IDF battle lab for approximately one day of experimentation (~7–8 h total), including briefing and training (~1 h), three experimental rounds (rounds #1 and #2 of 2 h each, round #3 of ~40 min), short breaks between and within rounds, and a lunch break. Rounds #1 and #2 consisted of four sessions, each with a different type of communication method (coordinates, augmentation on reality, still images with and without marking). The order was counterbalanced between experimental sessions. Each participant was given oral instructions about the task prior to beginning the experimental trials. The instructions included information such as the background and motivation for the experiment, its duration and payment, ethics, safety and confidentiality. Participants had to sign an informed consent form before participating.

A practice session took place (Round #0, “training” scenario) before the experimental trials began. The practice session was used to introduce participants to their teammates and to familiarize them with the general tasks and tools. Participants were asked to fill in the DSSQ after each of the four rounds. After the simulation parts ended, participants were asked to fill in all other questionnaires (Personal characteristics & military experience, Quality of communication questionnaire, and SUS - System Usability Scale). Finally, participants were asked whether they would like to provide any additional comments or concerns orally; they were then debriefed and compensated, and the experimental session ended.

3 Results

3.1 Objective Results (Simulation)

A total of 400 targets were programmed and planned for the three rounds, of which 305 (~76%) were executed during the experimental sessions. Hence, none of the teams completed all possible targets. Of the 305 targets, 100 out of 128 (78%) were executed within round #1, 152 out of 208 (73%) within round #2, and 53 out of 64 (83%) within round #3. Note that round #3 was shorter than rounds #1 and #2.
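The execution rates above can be recomputed from the reported counts. The following is a minimal sketch using only the numbers given in the text; the variable and function names are illustrative and are not part of the study's tooling.

```python
# Planned vs. executed targets per round, as reported in the text.
planned = {1: 128, 2: 208, 3: 64}
executed = {1: 100, 2: 152, 3: 53}


def execution_rate(done: int, total: int) -> float:
    """Return the share of planned targets that were executed, in percent."""
    return 100.0 * done / total


# Per-round rates and the overall rate across all three rounds.
rates = {r: execution_rate(executed[r], planned[r]) for r in planned}
total_planned = sum(planned.values())
total_executed = sum(executed.values())
overall = execution_rate(total_executed, total_planned)

for r, pct in rates.items():
    print(f"round #{r}: {executed[r]}/{planned[r]} = {pct:.0f}%")
print(f"overall: {total_executed}/{total_planned} = {overall:.0f}%")
```

The per-round counts sum to the reported totals (400 planned, 305 executed), and the rounded percentages match those stated above.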

With regard to viewing angle difference (i.e., where the observer was located relative to the attacker in the ground-ground teams), the distribution across rounds #1, #2, and #3 was 100 targets of narrow angle (up to 30°), 132 targets of mid angle (30–80°), and 73 targets of wide angle (80–150°).

Objective results were analyzed in three aspects: execution analysis (i.e., whether there was fire or it was ceased by the controller of the experiment), response time to acquire a target, in seconds, and accuracy of execution in meters (for acquired targets). In some specific cases of the attacker device simulations, the implementation of augmentation on the target in the simulation turned out to be problematic, as there were discrepancies between the position of the augmentation and the target itself, which affected objective performance. Therefore, this information was excluded from the statistical analysis related to execution and accuracy distance.

Execution Analysis.

The execution analysis includes rounds #1 and #2 and is detailed separately for teams with the Spike-MR (dismounted) attacker and teams with the Tank (mounted) attacker. Execution was defined as a binary variable (1 for fire, 0 for ceasefire).

Dismounted attacker.

Table 1 and Fig. 1 detail the number of executions that ended either with fire or ceasefire (i.e., time ran out), for the Spike-MR as attacker and the different observer types. Different patterns of performance can be seen for the different team combinations. Fire was executed in 73% of the cases on average. A logistic regression within the framework of the GLM (generalized linear model) was chosen for analysis. The model yielded a marginally significant effect for communication ability (p < .08) only for the Spike-MR+UAS configuration. No other effects were significant. Hence, for the Spike-MR+UAS team there was a trend for fewer ceasefires in the coordinates (baseline) condition relative to both still image types. In contrast, in the Ground-Ground teams (Spike-MR + Coral and Spike-MR + tablet), there seemed to be more fire executions using still images (either with or without markings) compared to the coordinates and the augmentation of location communication abilities, but these differences were not significant.

Table 1. No. of executions (fire and ceasefire) per teams of Spike-MR as an attacker, the different observers by communication means.
Fig. 1. Distribution of executions (fire and ceasefire) per teams of Spike-MR as an attacker and different observers and communication means.

Mounted attacker. Table 2 and Fig. 2 detail the number of executions that ended either with fire or with ceasefire because time ran out, for the teams of Tank and the different observers. As can be seen, more events ended with fire in the case of the Tank as an attacker (92% on average). The logistic regression analysis here yielded no significant main effects for communication ability.

Table 2. No. of executions (fire and ceasefire) per teams of Tank as an attacker and different observers and communication means
Fig. 2. Distribution of No. of fire and ceasefire per Tank attacker & different observer teams and communication means

Free Round analysis.

The last round (#3) enabled participants to choose any of the four communication means (coordinates, still images, still images with markings, and augmentation on reality). Here, fire was executed in 43 out of 53 (81%) of the cases, and the other 19% of the cases ended with no fire because time ran out. The results clearly show a preference for the still images communication ability, with 33 (77%) fires using still images with markings and 9 (21%) using still images without markings. The ‘augmentation on reality’ communication was used only once (but recall the earlier comment about its implementation: this type of augmentation is very sensitive to how accurately it is implemented).

Response time analysis.

The time to acquire a target was measured as the time from the beginning of the trial until execution. According to Parmet et al. (2014), the way most common response time analysis methods treat cases of no response (i.e., missing data) is inaccurate; they therefore suggested an alternative analysis technique, survival analysis, that leads to more reliable and robust conclusions. Survival analysis is a branch of statistics which deals with death in biological organisms and failure in mechanical systems. Generally, survival analysis involves the modeling of time-to-event data; in our case, the event is the participant’s response (or no response), i.e., the acquisition of a target. Survival analyses are statistical methods and procedures that accommodate censored data, that is, procedures that treat the information gained from uncensored and censored observations differently.
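To illustrate how censored trials enter a survival analysis, the following minimal sketch computes a Kaplan-Meier estimate of the probability that a target has not yet been acquired by time t. The Kaplan-Meier estimator is used here only as a simple illustration of censoring; it is not the model fitted in this study, and the trial data and function name are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, observed):
    """Return [(t, S(t))] at each distinct event time.

    times    -- duration of each trial in seconds
    observed -- True if the target was acquired (event), False if the
                trial was censored (e.g., ended in ceasefire at timeout)
    """
    events = Counter(t for t, o in zip(times, observed) if o)
    exits = Counter(times)  # every trial leaves the risk set at its own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Censored trials still count in the risk set up to their time.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical trials: three acquisitions and three censored trials.
times = [20, 35, 35, 50, 60, 60]
observed = [True, True, False, True, False, False]
curve = kaplan_meier(times, observed)
for t, s in curve:
    print(t, round(s, 3))
```

A censored trial contributes to the number at risk until its own end time and is then dropped without lowering the curve, which is how the information from censored and uncensored observations is treated differently.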

We fitted the Cox proportional-hazards regression model (Cox 1972), which is the most common tool for studying the dependency of survival time on predictor variables. The initial model included the communication ability (Coordinates, Still images, Still images with annotations, and augmentation on reality), the angle, and the interaction between the two. The interaction was not statistically significant for any of the models and was therefore removed from the analysis. The main effect for communication ability was found to be statistically significant in the cases of the Spike-MR + UAS team and the Tank + Coral team, see Fig. 3.
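For reference, the general form of the Cox proportional-hazards model is sketched below in standard textbook notation. The symbols are not taken from the study: $h_0(t)$ denotes an unspecified baseline hazard, $\mathbf{x}$ the vector of predictors (here, indicator variables for communication ability and the viewing angle), and $\boldsymbol{\beta}$ the regression coefficients.

```latex
% Hazard of acquiring a target at time t, given predictors x
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right)

% Hazard ratio for a one-unit change in predictor x_j, all else equal
\mathrm{HR}_j = \exp(\beta_j)
```

Under this form, a hazard ratio above 1 for a communication means would correspond to faster target acquisition relative to the reference condition.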

Fig. 3. Survival analysis for time to acquire a target. Significant effects for communication ability were found in the Tank + Coral (Observer) team configuration and in the Spike-MR (NT)+UAV team configuration.

Accuracy analysis.

The accuracy of target acquisition was measured by the distance between the target and the impact point. A logarithmic scale of the distance was used to display the data (Fig. 4).

Fig. 4. Distance from target in meters (logarithmic scale) for communication ability and the various team configurations.

Utilizing a GLM analysis on the log of the distance from the acquired target (normally distributed) with communication ability (Coordinates, Still images, Still images with annotations) as the predicting variable, main effects for communication ability were found for the Spike-MR & Coral team configuration (F(3,37) = 2.25, p < .097) and for the Spike-MR & Tablet team configuration (F(3,17) = 5.22, p < .0097). In both configurations, still images only and still images with annotations yielded shorter distances from the target than the coordinates.
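With a single categorical predictor, a GLM on a normally distributed outcome is equivalent to a one-way ANOVA. The following minimal sketch shows the log transform and the F statistic for that simplified case. It is not the authors' analysis code; the miss distances and the function name are hypothetical.

```python
import math


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Variation of the observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical miss distances in meters per communication means.
distances_m = {
    "coordinates": [12.0, 30.0, 8.0, 45.0],
    "still images": [3.0, 6.0, 2.5, 9.0],
    "still images with annotations": [2.0, 5.0, 4.0, 7.0],
}
# Distances are right-skewed, so the analysis is run on their logarithm.
log_groups = [[math.log(d) for d in ds] for ds in distances_m.values()]
f_stat, df1, df2 = one_way_anova_f(log_groups)
print(f"F({df1},{df2}) = {f_stat:.2f}")
```

The p-value would then be read from the F distribution with the two degrees of freedom, which requires a statistics library and is omitted here.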

With regard to the viewing angle, it was not balanced well across communication ability conditions by configurations, as can be seen in Fig. 5. Nevertheless, there is a trend showing that the accuracy distance of trials with still images was less sensitive to viewing angle than that of the other communication means. This finding needs to be replicated in future studies before making any clear statement.

Fig. 5. Distance from target in meters (logarithmic scale) as a function of angle for the Spike-MR and Coral configuration (top) and the Spike-MR and tablet configuration (bottom).

3.2 Subjective Results

SUS (usability evaluation).

After completing the entire experimental session and using the system in all four possible modes of communication, participants had to rate the usability of the system. They were asked to record their immediate response to each of the 15 items on a 5-point Likert scale. SUS yields a single number representing a composite measure of the overall usability evaluation of the system. SUS scores ranged from 55 to 100 (out of 100). Participants’ (both attackers and observers) average score was 88 (SD = 9). Hence, overall, participants were satisfied with the communication user interface.
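As background on how a composite SUS score is formed, the following minimal sketch implements the scoring of the standard 10-item SUS (Brooke 1996). The study used a 15-item questionnaire based on the SUS, and its exact scoring is not described here, so this is an assumption-laden illustration of the standard scale only, not the study's procedure.

```python
def sus_score(responses):
    """Score the standard 10-item SUS (Brooke 1996) on a 0-100 scale.

    responses -- ten ratings from 1 to 5, in questionnaire order.
    Odd-numbered items are positively worded, even-numbered negatively.
    """
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("expected ten ratings between 1 and 5")
    total = 0
    for i, r in enumerate(responses, start=1):
        # Each item contributes 0-4 points after aligning its direction.
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5


best = sus_score([5, 1, 5, 1, 5, 1, 5, 1, 5, 1])  # most favorable answers
neutral = sus_score([3] * 10)                     # all-neutral answers
print(best, neutral)
```

On this scale, the most favorable answer pattern yields 100 and an all-neutral pattern yields 50.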

DSSQ (stress evaluation).

This questionnaire is concerned with participants’ feelings and thoughts while performing the task (on a 0–4 scale). The DSSQ measures three aspects of subjective stress: task engagement (related to task interest and focus: energetic arousal, motivation, and concentration), distress (integrates unpleasant mood and tension with lack of confidence and perceived control), and worry (composed of self-focused attention, self-esteem, and cognitive interference). The DSSQ was collected after each experimental round (i.e., 4 times); hence, the percent of change over the course of the experimental day could be calculated. The overall averages were 24 (range 9–28, SD = 4), 7 (range 0–17, SD = 4), and 4 (range 0–16, SD = 4) for task engagement, distress and worry, respectively. The DSSQ scores after each round are detailed in Table 3. Note that the maximum ‘engagement’ score that can be achieved in the DSSQ is 28. Therefore, it seems that, on average, participants were highly motivated and engaged in doing the task. The highest possible scores for ‘distress’ and ‘worry’ are 28 and 24, respectively. This suggests that participants were fairly relaxed and became progressively less worried as the tasks progressed.
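To put the reported averages on a common footing, the following minimal sketch expresses each DSSQ average as a percentage of its maximum possible score and includes a helper for percent change between rounds. The averages and maxima are those given in the text; the helper name and its example inputs are hypothetical, since the per-round values of Table 3 are not reproduced here.

```python
# Overall averages and maximum possible scores, as reported in the text.
mean_score = {"task engagement": 24, "distress": 7, "worry": 4}
max_score = {"task engagement": 28, "distress": 28, "worry": 24}

# Each average as a percentage of the highest achievable score.
percent_of_max = {
    scale: 100.0 * mean_score[scale] / max_score[scale] for scale in mean_score
}


def percent_change(first: float, last: float) -> float:
    """Return the change from the first to the last round, in percent."""
    return 100.0 * (last - first) / first


for scale, pct in percent_of_max.items():
    print(f"{scale}: {pct:.0f}% of maximum")

# Hypothetical example: a score dropping from 8 to 6 between rounds.
print(percent_change(8, 6))
```

Engagement sits near the top of its range while distress and worry sit near the bottom of theirs, which is the pattern described above.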

Table 3. Average and SD results for the DSSQ by round (highest possible scores are 28, 28, and 24, respectively)

Quality of Communication Questionnaire.

In view of the problematic implementation, attributable to discrepancies in the position of the augmentation and the targets, the following analysis of communication quality excludes the ‘Augmentation of target location on reality’ mode. Two different versions of a subjective assessment of the communication quality were used; one for the attackers and one for the observers. The questions were divided into three groups – cooperation items (4 items in both the attackers’ and the observers’ versions, three of them were identical), coordination items (4 items for both attackers and observers, three of them were identical), and performance items (4 items in the attackers’ version and 2 items in the observers’ version).

The three shared ‘cooperation’ items were aimed at evaluating team work and interaction (i.e., “Team cooperation was good”, “We used the same jargon”, “A unique common language was created between us”). Figure 6 presents the different patterns of evaluation among the different teams and communication means.

Fig. 6. Quality of Communication - Average evaluations of the cooperation items (N = 3) per teams and communication means

The three common coordination items (i.e., “It was necessary to use verbal communication to acquire the target”, “We worked in specific sequential order”, and “The verbal communication was based on still images”; note that the last item was relevant only for the still images communication means) were aimed at evaluating the advantages (or disadvantages) of the different communication means and the teams’ working techniques. Figure 7 presents the different patterns of evaluation among the different teams and communication means.

Fig. 7. Quality of Communication - Average evaluations of the coordination items (N = 3) per teams and communication means

Participants scored low (Average = 1.2, STD = 0.9) on the fourth coordination item (attacker version: “I was overloaded and couldn’t use all still images I received”, and observer version: “I felt that the attacker did not use the still images I sent but trying to “figure out” by himself”), indicating that they favored the still images communication. The average of all 6 common items across all teams, including the coordinates and still images (with and without markings) communication means, was 4.1 (on a 5-point Likert scale), which means that the participants were highly coordinated and managed to create unique communication.

As for the self-performance evaluation, the analysis was done for the observers and the attackers separately. The performance items in the observer version were: “I was able to understand the attacker point of view” and “I was able to instruct the attacker based on his point of view”. The results show that the observers preferred the still images (either with or without markings) compared to coordinates (See Fig. 8 top). The average of all observers across all performance items was 3.6 (STD = 0.8). The performance items of the attackers which were included in the analysis (i.e., “It was difficult for me to acquire the target’s surrounding”, “It was difficult for me to acquire the target itself”, and “I felt confident with the target acquisition”) had an average of 3 (STD = 0.4) across all attackers. See Fig. 8 bottom for the detailed results. The attackers’ scores on the fourth performance item - “I felt confident attacking the target based on pictures”, which was relevant only for the still images (with and without markings) communication means, were high (Average = 4.3, STD = 1) with no differences between the still images only and the still images with markings.

Fig. 8. Quality of Communication - Average evaluations of the performance items (N = 2) per observers (top) and (N = 3) per attackers (bottom), and communication means.

3.3 Verbal Communication

The verbal communication channel between the attackers and the observers was available throughout the experiment. Verbal communication was measured by the total number of ‘ping-pong’ transmissions between the attackers and the observers and by the percent of time verbal communication was in use. A total of 5702 ‘ping-pongs’ took place, 56% of them in teams with the Spike-MR attacker and 44% in teams with the Tank attacker. By observer type, 39% took place in teams with the Coral, 37% with the tablet, and 24% with the UAS. See Figs. 9 and 10 for the detailed results and the different patterns among the various teams and communication means.

Fig. 9. Communication volume as measured by number of ‘Ping-Pong’ transmissions per team (attacker-observer) and communication means

Fig. 10. Communication volume as measured by percent of transmission time per team (attacker-observer) and communication means

The controllers were asked to rate their impression of the necessity of verbal communication between the observers and the attackers for target acquisition. The results are shown in Fig. 11.

Fig. 11. Necessity of verbal communication - Average of the controllers’ evaluations per communication means

4 Conclusions

In this study, we examined the added value of using one-way still images graphic communication between a ground soldier (represented by Tank (mounted) or Spike-MR (dismounted) attackers) and an operator (represented by Coral, tablet or UAS observers). The abilities to communicate by using still images with or without adding markings were compared to a basic communication based on coordinates and a future solution (‘location on reality’ – augmentation of target location on reality).

Objective and subjective measures were obtained. From those, it seems that in the ground-ground teams, operators benefited from the use of still images, especially in the time to acquire a target and the accuracy distance (see Tables 1 and 2 and Figs. 3 and 4). In contrast, still images from the UAS were less beneficial to the ground operators relative to coordinates in terms of time to acquire a target and executions. The use of still images may improve if (a) bi-directional graphic communication allows operators on both sides to send or annotate sent images, and (b) annotations are structured to enhance situation awareness, for example by using a fixed set of tools or annotation symbols (e.g., Granlund et al. 2011). Overall, observer-type participants favored the use of still images: in the free choice round #3, they chose the still image configurations in almost all cases (98%).

Communication within a team was verbal (two-way) at all times and graphic (one-way) when applicable. As noted from Figs. 6, 7, 8, 9, 10 and 11, there were differences in the way the observers and the attackers perceived the quality of communication, and there were differences between the mounted and the dismounted attackers as well. All in all, it seems that the addition of graphic communication increased the number of ping-pong chats between team members but, on average, shortened the duration of the chat. From the perspective of the controllers, who were supervising the teams, it seems that the Ground-Ground teams needed less verbal communication compared to the Ground-Aerial teams. Furthermore, less verbal communication was required when using the still images (with or without markings) interface compared to coordinates.

With regard to our hypotheses, we can conclude that: (1) still images may improve the communication between Ground-Ground teams; (2) in some cases, adding the ability to graphically mark elements on the still image (annotations) improved the communication in terms of self-evaluation of team cooperation and performance, even more than still images alone; (3) in the presence of graphic communication, the verbal communication patterns changed, with more but shorter ‘ping-pong’ transmissions between team members; and (4) when participants had a free choice of communication means, they preferred the still images with graphic markings over the other three alternatives (coordinates, augmentation on reality and still images without graphic markings). These findings are consistent with our previous communication study (Oron-Gilad and Oppenheim, Final research report 2015) in a few aspects:

  • The use of rather simple communication means, like temporary markings on the video feed in the previous experiment and still images in the current one, was beneficial, at least in ground-ground settings. It is therefore important to continue examining simple means of graphic communication, which are also simpler to implement.

  • It is important to develop a structured interface which emulates the relevant available information, and to create a common terminology in order to simplify the graphic communication and allow convenient utilization of the new tools.

  • The results confirm the inherent differences between Ground-Ground communication needs and those of Aerial-Ground teams in terms of different perspectives and shared views. This is not a new problem (See Oron-Gilad et al. 2011; Ophir-Arbelle et al. 2012 for example) but it requires attention.