1 Introduction

There are numerous domains in which professionals must search through visual displays of information to identify key features and make decisions about them. From radiologists looking for evidence of cancer to radar analysts searching for unusual radar signatures, many high-consequence decisions rely on visual cognition. Despite the importance of these tasks, few studies have investigated the relationships between professional visual search experience and visual cognition. Although a great deal of research has focused on characterizing visual search performance in humans, most studies have used college students with no particular expertise in visual search tasks beyond those encountered in daily life [1–3].

Only a handful of studies have investigated the performance of people with professional experience in high-consequence visual search tasks. Work by Mitroff and colleagues has investigated the visual search performance of Transportation Security Officers (TSOs) who have experience with searching X-ray images of luggage for prohibited items [4]. However, in most of these studies, the TSOs completed domain-general visual search tasks rather than tasks from within their area of expertise [4, 5]. Other researchers have focused on radiologists who have experience with searching images for evidence of cancer or other abnormalities [6–9]. In that domain, participants typically view images from within their area of expertise, such as mammograms. Very few studies have tested radiologists using domain-general tasks.

There are even fewer studies that compare professional visual searchers to novices [4, 10, 11]. The lack of research in this area represents a substantial gap in the literature. A deeper understanding of the interplay between experience and visual cognitive processes would benefit numerous areas, including training, assessment, and system design. Further research in this area could also help to address fundamental questions about learning. While some researchers argue that professional visual search experience does not alter visual cognition [12], there is evidence from other domains suggesting that domain-specific experience can influence visual processing, even on a neural level [13].

To add to the existing research on the effects of professional visual search experience on basic cognitive processes, we sought to test the visual search performance of participants with varying levels of experience in a previously unstudied domain. In the present study, participants completed a battery of visual search and general visual cognition tasks. One of the visual search tasks used Synthetic Aperture Radar (SAR) imagery. SAR is used in a variety of surveillance and mapping applications. The imagery is superficially similar to optical imagery, but extensive training is required for analysts to learn to interpret SAR phenomenology correctly. This makes it particularly useful for investigating differences in visual search between experts and novices. In addition, unlike X-rays and other more common forms of imagery, most people have never seen a SAR image and are true novices in this domain (Fig. 1).

Fig. 1. SAR image of a baseball diamond. Image courtesy of Sandia National Laboratories, Airborne ISR.

There were three groups of participants in this experiment. The first group consisted of professional SAR imagery analysts who conduct visual search tasks using SAR imagery in their daily jobs. They have had extensive training and experience working with SAR imagery. The second group consisted of professionals who work with SAR images regularly, typically on a weekly basis. They have extensive knowledge of the domain but do not typically engage in visual search tasks using the imagery. Most of the participants in this group were radar engineers who design and test the SAR systems. The third group consisted of novices who had no prior exposure to SAR imagery. All three groups completed three behavioral tasks and three eye tracking tasks. The behavioral tasks tested basic aspects of visual cognition, including visual attention, mental rotation, and spatial working memory. The eye tracking tasks included two domain-general visual search tasks and the domain-specific visual search task using SAR imagery.

2 Method

2.1 Participants

Twenty-four people participated in the study. Eight were professional SAR analysts (1 female; mean age 42), eight were employees of Sandia National Laboratories who work with SAR images regularly but are not trained as imagery analysts (3 female; mean age 41), and eight were Sandia employees with no prior experience with SAR imagery (4 female; mean age 32). In the analysis, we will refer to these groups as the analysts, experienced non-analysts, and novices, respectively. All participants gave their written informed consent before participating in the study.

2.2 Domain-Specific Visual Search Task

In the domain-specific visual search task, participants were shown two SAR images side by side. The left side of the screen showed a SAR image of a scene and the right side of the screen showed a Coherent Change Detection (CCD) image of the same scene. The CCD image is created by co-registering SAR images of the same scene and measuring changes in coherence that can reveal temporal changes [14]. Essentially, the left image provided viewers with contextual information about the scene and the right image provided viewers with information about the presence or absence of targets in the scene. There were 20 pairs of images, half of which contained a target and half of which did not. The targets were the same types of targets that the professional SAR analysts look for in their daily work. The experienced non-analysts were also familiar with the nature of the targets. The novices received instructions about what to look for to determine whether or not a target was present in the scene. Each stimulus was preceded by a fixation cross that appeared on the screen for one second. When the stimulus appeared, participants were asked to search for any targets and to use a 1–4 scale to indicate whether or not they thought a target was present. A response of “1” indicated that they were sure that there was not a target in the scene. A response of “2” indicated that they thought there was no target, but they were unsure. A response of “3” indicated that they thought there was a target present, but were unsure. A response of “4” indicated that they were sure that there was a target present. The participants were asked to respond as quickly and accurately as possible. They had a maximum of 45 s to respond to each stimulus and they did not receive feedback about their answers.
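The text does not state how the 1–4 ratings were converted into correct and incorrect responses. The following is a minimal sketch of one plausible scoring rule, assuming that ratings of 1 and 2 count as "target absent" and ratings of 3 and 4 count as "target present". The function names and the binarization rule are illustrative assumptions, not the authors' documented procedure, and trials that timed out at 45 s are not handled.

```python
def rating_to_decision(rating: int) -> bool:
    """Map a 1-4 confidence rating to a target-present decision.

    Assumed rule (not stated in the paper): 1-2 = absent, 3-4 = present.
    """
    if rating not in (1, 2, 3, 4):
        raise ValueError(f"rating must be 1-4, got {rating}")
    return rating >= 3


def accuracy(ratings, target_present):
    """Fraction of trials where the binarized rating matches ground truth."""
    if len(ratings) != len(target_present):
        raise ValueError("ratings and target_present must have equal length")
    correct = sum(
        rating_to_decision(r) == t for r, t in zip(ratings, target_present)
    )
    return correct / len(ratings)


# Made-up example: four trials, two with a target and two without.
example = accuracy([1, 2, 3, 4], [False, False, True, True])
```

Under this rule the confidence level itself does not affect accuracy; it would matter only for analyses, such as ROC curves, that the text does not describe.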

2.3 Domain-General Visual Search Tasks

Participants completed two domain-general eye tracking tasks. The first, the O and Q task, was based on classic visual perception studies [15]. The O and Q task consisted of 120 trials, 60 in which participants were asked to search for the letter O hidden among a field of Qs and 60 in which participants were asked to search for the letter Q hidden among a field of Os. The order of the O block and the Q block was counterbalanced across participants. In both blocks, each stimulus contained 1, 4, 8, 16 or 24 letters. Examples of stimuli from each block are shown in Fig. 2. One letter was always the target and the remainder were distractors. The target appeared equally often in each quadrant of the screen and the stimuli were presented in a pseudorandom order so that no more than two stimuli of the same set size appeared in a row. Each stimulus was preceded by a fixation cross that appeared on the screen for one second. When the stimulus appeared, participants were asked to find the target as quickly as possible and to press the space bar when they had found it.
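The ordering constraint described above (no more than two consecutive stimuli of the same set size, with targets balanced across quadrants) can be sketched with simple rejection sampling. This is an illustration only: the paper does not say how the sequences were generated, and the assumption of three trials per combination of set size and quadrant is inferred from 60 trials, five set sizes, and four quadrants. All names are hypothetical.

```python
import random
from itertools import product

SET_SIZES = (1, 4, 8, 16, 24)
QUADRANTS = ("upper_left", "upper_right", "lower_left", "lower_right")


def max_run_length(values):
    """Length of the longest run of identical consecutive values."""
    longest = current = 0
    previous = object()  # sentinel that equals no real value
    for value in values:
        current = current + 1 if value == previous else 1
        longest = max(longest, current)
        previous = value
    return longest


def make_block(trials_per_cell=3, max_run=2, seed=None, max_attempts=10_000):
    """Shuffle a balanced trial list until the run-length constraint holds."""
    rng = random.Random(seed)
    trials = [
        {"set_size": size, "quadrant": quadrant}
        for size, quadrant in product(SET_SIZES, QUADRANTS)
        for _ in range(trials_per_cell)
    ]
    for _ in range(max_attempts):
        rng.shuffle(trials)
        if max_run_length([t["set_size"] for t in trials]) <= max_run:
            return trials
    raise RuntimeError("no valid ordering found within max_attempts")


block = make_block(seed=1)  # 5 set sizes x 4 quadrants x 3 = 60 trials
```

Rejection sampling is chosen here for clarity; it keeps every valid ordering equally likely, at the cost of occasionally reshuffling.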

Fig. 2. Examples of the stimuli used in the O Task (left) and the Q Task (right), both with a set size of 24.

The second domain-general task was the T and L task, which has been used in prior studies of professional visual searchers [4]. In the T and L task, participants were tasked with determining whether or not there was a perfect T, with a centered crossbar, in the image. The distractors were offset Ts that looked more like Ls. The targets and distractors appeared in grey against a mottled grey background and could be oriented in any of four directions (0°, 90°, 180°, 270°). Each image contained 4, 8 or 16 letters. Examples of stimuli from each set size are shown in Fig. 3. There were 48 trials, half of which contained a target. The targets appeared equally often in each quadrant of the screen. As in the other eye tracking tasks, each stimulus was preceded by a fixation cross that appeared on the screen for one second. When the stimulus appeared, participants were asked to respond as quickly and accurately as possible by pressing the “z” key on the keyboard if a target was present and the “m” key if the target was absent.

Fig. 3. Examples of the stimuli from the T and L task, showing set sizes 4, 8, and 16.

Behavioral (accuracy and reaction time) and eye tracking data were recorded for both tasks.

2.4 Domain-General Visual Cognition Tasks

The participants completed three visual cognition tasks that tested mental rotation, visual attention, and spatial working memory. In the mental rotation task, participants saw pairs of line drawings representing three-dimensional figures. They were asked to mentally rotate the figures to determine whether or not the two figures were the same. They responded by clicking the left or right mouse button. Participants were asked to complete as many trials as possible in 60 s. Their scores were calculated by subtracting the number of incorrect responses from the number of correct responses (Fig. 4).

Fig. 4. Example stimulus from the mental rotation task.

In the visual attention task, participants fixated on a point in the center of the screen. An array of squares, radiating away from the fixation point along eight arms, was presented for 30 ms. All of the squares were white, except for one that was black. The stimulus was followed by a visual mask. Then participants were asked to indicate which of the eight arms contained the black square. There were a total of 24 trials, 8 at each of three distances from the fixation point (close, middle and far). Participants were scored based on how many trials they answered correctly at each distance (Fig. 5).

Fig. 5. Timeline for the visual attention task.

The spatial working memory task was a rotation span task based on work by Shah and Miyake [16]. There were two interleaved tasks. In the memory task, participants saw arrows that were one of two lengths (long or short), pointing in one of eight directions (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°). In the secondary task, participants saw one of five letters (R, L, J, G, and F) that was either normal or backwards (flipped across the vertical axis) and were rotated in one of eight different orientations (0°, 45°, 90°, 135°, 180°, 225°, 270°, 315°). On each trial, participants saw a sequence of letters, each of which was followed by an arrow. For each letter, the participants had to press a key on the keyboard to indicate whether the letter was presented normally or backwards. The letter remained on the screen until the participant made a response. Then an arrow was presented for 1000 ms. The trials varied in length and contained between two and five arrows. After the last arrow was presented, participants were asked to recall the sequence of arrows that they had seen in that trial. The recall screen showed all 16 possible arrows (long or short arrows at each of eight orientations). Participants clicked on the arrows to indicate which arrows had appeared in the previous sequence, in the order that they appeared. Participants were scored based on the number of arrows that they recalled correctly (Fig. 6).
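The scoring of the rotation span task can be illustrated with a short sketch. It assumes that an arrow counts as correctly recalled only when both its length and its direction match the arrow presented at the same serial position; the text says only that scoring was based on the number of arrows recalled correctly, so this position-wise rule and the function names are assumptions.

```python
def score_trial(presented, recalled):
    """Count arrows recalled in the correct serial position (assumed rule).

    Each arrow is a (length, direction_degrees) tuple, e.g. ("long", 45).
    Missing or extra recalled arrows simply earn no credit.
    """
    return sum(p == r for p, r in zip(presented, recalled))


def percent_recalled(trials):
    """Percentage of presented arrows recalled correctly across trials."""
    total = sum(len(presented) for presented, _ in trials)
    correct = sum(score_trial(p, r) for p, r in trials)
    return 100.0 * correct / total


# Made-up example: one two-arrow trial and one three-arrow trial.
trials = [
    ([("long", 0), ("short", 90)], [("long", 0), ("short", 45)]),
    (
        [("short", 180), ("long", 270), ("long", 315)],
        [("short", 180), ("long", 270), ("long", 315)],
    ),
]
score = percent_recalled(trials)
```

In this made-up example four of the five presented arrows match, which gives 80 %.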

Fig. 6. Timeline for the spatial working memory task.

2.5 Eye Tracking Methods

The eye tracking data were collected using the FaceLab 5 Standard System and EyeWorks software. The eye tracker was mounted below the computer monitor and consisted of two miniature digital cameras and one infrared illumination pod. The system was calibrated for each participant by having the participants look at each point of a 9-point grid displayed on the monitor. After calibration, observers were free to move their heads while completing eye tracking tasks.

3 Results

3.1 Domain-Specific Visual Search Task

Behavioral Results. On the SAR imagery search task, the novice participants responded correctly to 56.9 % of the trials, the experienced non-analysts responded correctly to 70.0 % of the trials, and the analysts responded correctly to 74.4 % of the trials, on average. A one-way ANOVA showed that the average accuracy differed significantly between groups (F(2,21) = 4.62, p < 0.03). Post hoc t-tests showed that the analysts had significantly higher accuracy than the novices (t(14) = 2.95, p < 0.01), as did the experienced non-analysts (t(14) = 2.14, p < 0.03). The performance of the analysts and experienced non-analysts did not differ significantly (t(14) = 0.73).
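The degrees of freedom reported here follow from the design: with three groups and 24 participants, a one-way ANOVA has 3 − 1 = 2 and 24 − 3 = 21 degrees of freedom, and t(14) is consistent with an independent-samples t-test on two groups of eight (16 − 2 = 14). The sketch below shows how the F statistic is computed for a between-groups design. It is a generic textbook calculation, not the authors' analysis code, and the data are invented for illustration.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores.

    Assumes at least two groups and nonzero within-group variance.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variability of group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means)
    )
    # Variability of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented scores for three small groups (not data from this study).
f_stat, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

For these invented scores the between-groups sum of squares is 26 and the within-groups sum of squares is 6, so F(2, 6) = 13.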

The average reaction times were 22.4 s for the novices, 14.5 s for the experienced non-analysts, and 9.5 s for the analysts. A one-way ANOVA showed that the groups differed significantly in their reaction times (F(2,21) = 11.98, p < 0.001). Post hoc t-tests showed that the analysts were significantly faster than the experienced non-analysts (t(14) = 2.93, p < 0.01) and the novices (t(14) = 4.34, p < 0.001), and the experienced non-analysts were significantly faster than the novices (t(14) = 2.57, p < 0.02).

Eye Tracking Results. Two participants, one from the novice group and one from the experienced non-analyst group, were excluded from the eye tracking data analysis due to noisy data. A region of interest (ROI) was demarcated around each target and contained the target itself plus a buffer intended to represent a person’s useful field of view (approximately 90 pixels on each side of the target). Time to first fixation in the ROI was calculated for each trial in which a target was present. The average time to the first fixation in the ROI was 5.3 s for novices, 3.0 s for experienced non-analysts, and 2.1 s for analysts. The difference between groups was significant (F(2,19) = 9.21, p < 0.01). Post hoc t-tests showed that the experienced non-analysts and the analysts were both significantly faster than the novices (t(12) = 2.41, p < 0.02 and t(13) = 4.36, p < 0.001, respectively). However, the experienced non-analysts and the analysts did not differ significantly from one another (t(13) = 1.53, p = 0.08).

For each trial, we calculated the percentage of total fixations that occurred within the ROI. On average, 17.4 % of the novices’ fixations were in the ROI, compared to 25.3 % for the experienced non-analysts and 38.9 % for the analysts. The difference between groups was significant (F(2, 19) = 8.08, p < 0.01). Post hoc t-tests showed that the experienced non-analysts had a significantly higher percentage of fixations in the ROI than the novices (t(12) = 2.47, p < 0.02) and the analysts had a significantly higher percentage of fixations in the ROI than the experienced non-analysts (t(13) = 2.13, p < 0.03).
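The two ROI measures used in this section, time to first fixation and percentage of fixations in the ROI, can be sketched as follows. The sketch assumes a rectangular ROI formed by padding the target's bounding box by 90 pixels on each side, and a fixation record holding an onset time relative to stimulus onset plus screen coordinates in pixels. The class and function names are hypothetical, and the actual EyeWorks output format is not described in the text.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Fixation:
    onset_s: float  # seconds since stimulus onset
    x: float        # horizontal screen position in pixels
    y: float        # vertical screen position in pixels


@dataclass(frozen=True)
class ROI:
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, fix: Fixation) -> bool:
        return (self.left <= fix.x <= self.right
                and self.top <= fix.y <= self.bottom)


def roi_around_target(left, top, right, bottom, buffer_px=90.0) -> ROI:
    """Pad the target bounding box by buffer_px on every side."""
    return ROI(left - buffer_px, top - buffer_px,
               right + buffer_px, bottom + buffer_px)


def time_to_first_fixation(fixations: Sequence[Fixation],
                           roi: ROI) -> Optional[float]:
    """Onset of the earliest fixation inside the ROI, or None if none."""
    hits = [f.onset_s for f in fixations if roi.contains(f)]
    return min(hits) if hits else None


def percent_fixations_in_roi(fixations: Sequence[Fixation],
                             roi: ROI) -> Optional[float]:
    """Percentage of all fixations in the trial that fall inside the ROI."""
    if not fixations:
        return None
    inside = sum(roi.contains(f) for f in fixations)
    return 100.0 * inside / len(fixations)


# Made-up trial: a 50 x 50 pixel target and four fixations.
roi = roi_around_target(400, 300, 450, 350)
trial = [Fixation(0.2, 100, 100), Fixation(0.9, 320, 220),
         Fixation(1.5, 500, 400), Fixation(2.0, 700, 700)]
ttff = time_to_first_fixation(trial, roi)
pct = percent_fixations_in_roi(trial, roi)
```

Returning None rather than zero for trials with no ROI fixation keeps such trials from being mistaken for an immediate hit when averaging.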

3.2 Domain-General Visual Search Tasks

The participants’ average response times for each target type and set size in the O and Q task are shown in Fig. 7. A one-way ANOVA showed that the three groups’ average reaction times on the O block did not differ significantly (F(2, 14) = 0.31). However, the average reaction times of the three groups did differ significantly on the Q block (F(2,14) = 4.10, p = 0.04). Post hoc t-tests showed that the reaction times of the participants in the novice group were significantly slower than those of the experienced non-analysts (t(8) = 2.56, p = 0.02) and those of the analysts (t(8) = 2.15, p = 0.03). The analysts and experienced non-analysts did not differ significantly from one another (t(8) = 0.59).

Fig. 7. Average reaction times for the O and Q tasks.

On the T and L task, novice participants responded correctly to 83.3 % of the trials, experienced non-analysts responded correctly to 85.4 %, and analysts responded correctly to 83.0 %, on average. The differences in accuracy between the groups were not significant (F(2, 19) = 0.28). The average reaction times for the groups were 4.7 s for the novices, 4.1 s for the experienced non-analysts, and 3.5 s for the analysts. A one-way ANOVA showed that there was a significant difference in average reaction times between the groups (F(2,19) = 3.61, p < 0.05). Post hoc t-tests showed that the analysts were significantly faster than the novices (t(12) = 4.49, p < 0.001).

3.3 Domain-General Visual Cognition Tasks

Due to time constraints, only five of the eight participants in the analyst group completed the visual attention task. The novice participants averaged 6.5 correct for the close trials, 7.25 correct for the middle trials, and 7 correct for the far trials. The experienced non-analysts averaged 7.25 correct for the close trials, 8 correct for the middle trials, and 7.88 correct for the far trials. The analysts averaged 6.8 correct for the close trials, 7.2 correct for the middle trials, and 7.6 correct for the far trials. One-way ANOVAs were used to compare the scores of the three groups for each trial type. These showed that the performance of the three groups did not differ significantly for any of the trial types (all Fs < 2.83, all ps > 0.09).

On the mental rotation task, the novices completed an average of 6.75 trials with an average score of 3 (the number of correct trials minus the number of incorrect trials). The experienced non-analysts completed an average of 10.25 trials with an average score of 5.5. The analysts completed an average of 12 trials with an average score of 5.5. One-way ANOVAs showed that there were no significant differences between groups in the number of trials completed (F(2,21) = 2.24, p = 0.13) or in score (F(2,21) = 1.43, p = 0.26).

On the rotation span task, the novices correctly recalled 54.8 % of the arrows, the experienced non-analysts correctly recalled 49.7 % of the arrows, and the analysts correctly recalled 50.4 % of the arrows, on average. Two of the analysts and one of the experienced non-analysts did not complete the rotation span task due to time constraints. A one-way ANOVA showed that the three groups did not differ significantly in performance (F(2,18) = 0.19).

4 Discussion

The results of this study indicate that professional experience in a specific visual search domain corresponds with improved performance on domain-general visual search tasks. Working within their domain of expertise, the SAR imagery analysts and experienced non-analysts were both more accurate in their responses than the novices, who had not viewed SAR imagery before taking part in the experiment. In addition to their high accuracy, the analysts were faster than experienced non-analysts and novices, both in terms of overall task reaction time and in terms of the time to first fixation in the ROI. The analysts were highly efficient in their ability to identify the ROI, typically fixating in the ROI within two seconds of stimulus onset. They devoted a higher proportion of fixations to the ROI than either of the other groups.

The analysts and experienced non-analysts also out-performed the novices on the domain-general visual search tasks. The analysts had significantly faster reaction times than the novices on the T and L task, and both the analysts and experienced non-analysts were significantly faster than the novices on the parallel visual search component of the O and Q task. However, on the domain-general visual cognition tasks that were not directly related to visual search, the performance of the three groups did not differ.

Our finding that professional imagery analysts were faster than novices on domain-general visual search tasks stands in contrast to some prior literature. One of the few prior studies to compare professional visual searchers to novices [4] found that the professionals were more accurate than the novices, but significantly slower. However, these studies differ in several important respects, most notably with regard to the groups of professionals used. The present study indicates that experience with non-optical imagery may improve visual search capabilities in ways that benefit performance on general visual search tasks, but the differences between this study and the work by Biggs and Mitroff suggest that the effects of experience may depend on the domain. Different professional domains are likely to draw on slightly different cognitive processes or skills, leading to different effects on domain-general tasks. More research is needed to tease apart the influences of various kinds of visual search experience on specific aspects of visual cognition.

The work described in this paper is part of a broader set of research activities at Sandia National Laboratories related to real-world visual search. We are developing a cross-domain research program, focused on populations of professionals with varying levels of experience who spend a predominant amount of their work time searching for anomalies and/or known signatures in imagery or other visual representations of data. As discussed in McNamara et al. (this volume), our team is exploring integrated qualitative and quantitative methodological frameworks that will enable us to balance experimental rigor with work domain realism. The goal of this research is to inform the design and implementation of real-world work tools, training, and processes to best support the complex, often stressful work of rapid visual anomaly detection in high-pressure work situations. In doing so, we also aim to generate richer empirical data about the role of visual perception and cognition in complex decision-making environments.