1 Introduction

Interest in monitoring attention in real time is growing for use in simulation and training environments, both to learn more about the dynamics of fluid attention during real-world task performance and to monitor, assess and respond to changes in the performer’s state such as overload or distraction. The most common tools for such purposes include eye-tracking devices. Though effective, eye-tracking alone cannot account for covert shifts or lapses in attention that are not accompanied by an eye saccade. A neural method for measuring the relative power of steady state visual evoked potentials (ssVEPs) can detect covert attention shifts, offering a potential complementary capability to traditional eye-tracking methods. ssVEPs have already been used to demonstrate brain-computer interface attention-based control of remote control vehicles [1, 2], yet these methods rely on low-frequency visual flickers that are distracting and tiring to view. Researchers and training courses use simulation to make experiments and learning environments as ecologically valid as possible and to gather data on the performer that would not otherwise be available. In the act of measuring the performer, the simulation instrumentation should affect the student or subject’s performance as little as possible. For this reason, though low-frequency ssVEPs are effective, they are impractical for use in training or simulation environments.

High-frequency flicker stimuli, however, appear solid to the human eye, meaning high-frequency ssVEPs may offer a viable solution for detecting covert attention shifts in simulation and training environments. The ssVEPs these high-frequency stimuli generate are more difficult to detect and extract than low-frequency ssVEPs, and it is not clear whether higher-frequency signals are sensitive to covert attention shifts like their lower-frequency counterparts. For this reason, the present limited exploration examines whether high-frequency visual flickers evoke reliable and detectable ssVEPs that are sensitive to attention shifts, so that we may develop the method for monitoring attention in high-fidelity simulation and training environments.

2 Background

Steady State Visual Evoked Potentials (ssVEPs) are cortical oscillations stimulated by an external driving frequency such as a visual flicker; electroencephalography (EEG) can detect the resulting potential evoked in the cerebral cortex. A 12 Hz visual flicker, for instance, will drive a 12 Hz ssVEP in areas of the extrastriate visual cortex. Any flicker within the visual field should generate an ssVEP of some kind; however, the amplitude and gain of the signal are modulated by attention [3, 4]. When a person attends to the driving frequency, the amplitude of the ssVEP increases, and the amplitude drops as the person shifts attention away (see Fig. 1). Work in primates that records signals directly from active brain tissue suggests that the attention-directing frontal eye fields within the frontal cortex direct the dorsal attention network [4, 5] to amplify incoming visual signals from parietal and occipital regions. This “top-down” attention control amplifies signals of interest even without directing the eyes towards the stimuli.

Fig. 1. Difference in amplitude between covertly attended (solid) and ignored (dashed) 12 Hz ssVEP signals. From Russell et al. [7]

Traditionally the difference between ignored and attended signals is quantified with the Attention Modulation Index (AMI) [8], where:

$$ AMI = \frac{RMS_{attend} - RMS_{ignore}}{RMS_{attend} + RMS_{ignore}} \tag{1} $$

and

$$ RMS = \sqrt{\frac{x_{1}^{2} + x_{2}^{2} + \ldots + x_{n}^{2}}{n}} \tag{2} $$

For EEG measures, \( n = (sampling\;frequency \times time_{seconds} ) \) and \( x \) is voltage.
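As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes RMS and AMI from two voltage segments. The function names and the synthetic 12 Hz test signals (with made-up amplitudes) are hypothetical and are not taken from the study.

```python
import math


def rms(samples):
    """Root mean square of a sequence of voltage samples (Eq. 2)."""
    if not samples:
        raise ValueError("samples must be non-empty")
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def attention_modulation_index(attend, ignore):
    """AMI computed from attended and ignored voltage segments (Eq. 1)."""
    rms_attend = rms(attend)
    rms_ignore = rms(ignore)
    return (rms_attend - rms_ignore) / (rms_attend + rms_ignore)


# Synthetic example: 1 s of a 12 Hz sinusoid sampled at 256 Hz.
# The amplitudes (2.0 and 1.0) are invented purely for illustration.
fs = 256
attend = [2.0 * math.sin(2 * math.pi * 12 * t / fs) for t in range(fs)]
ignore = [1.0 * math.sin(2 * math.pi * 12 * t / fs) for t in range(fs)]
ami = attention_modulation_index(attend, ignore)
```

Because the index is a normalized difference, it is bounded between -1 and 1 and does not depend on the absolute scale of the recorded voltages.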

A low AMI indicates low engagement, whereas high AMI indicates high engagement relative to an unengaged baseline measure. This type of method has been used with great success to study attention in a variety of ways:

  • Mishra et al. used multiple simultaneous ssVEPs to demonstrate that it is not increased focus on task-relevant detail, but the enhanced suppression of potentially distracting stimuli, that underlies superior perceptual performance (speed and accuracy) of skilled video game players [8].

  • Over a series of studies, Keil and colleagues [9–12] have demonstrated that emotional stimuli capture attention to a greater degree and are processed preferentially compared to neutral stimuli.

  • Russell et al. [7] used this method to examine the simultaneous dynamics of top-down and bottom-up attention control changes in response to threat of shock, showing that increased top-down focus could not overcome the increased distractibility associated with acute anxiety.

Steady State Visual Evoked Potentials are robust enough for applied as well as research applications. For example, roboticists have used attention-modulated ssVEPs to control various kinds of remote vehicles via EEG-based brain-computer interfaces [1, 2].

Critically, because ssVEP amplitude and AMI are sensitive to covert attention shifts, they are capable of detecting attention lapses (i.e., “zoning out”) and moments when a performer attends to peripheral visual fields. Moreover, multiple flicker frequencies can generate corresponding ssVEPs simultaneously [3], offering the ability to distinguish between attention shifts among multiple data streams in complex visual scenes. For these reasons, ssVEPs in combination with traditional eye-tracking offer a more comprehensive method for understanding the complexities of multitasked, real-world environments where attention is divided among many information streams across the visual field.

There are a few technical hurdles, however, that will have to be overcome before we can realize this kind of capability. In the present paper we discuss our efforts to address one of these challenges and examine the feasibility of using high-frequency, rather than low-frequency, ssVEP-driving stimuli to monitor attention. Most research applications use frequencies within the alpha band (8–12 Hz) because those are the strongest and most predominant frequencies in cortical activity, with the highest signal-to-noise ratio [13]. They are also easier to measure in the time domain despite slight phase shifts that occur during visual processing. This poses a problem for applied use, however, as frequencies in this range are very noticeable, highly distracting and strain the eyes. Because the purpose of using ssVEPs is to unobtrusively monitor attention habits, shifts and patterns, noticeable flickers in the 8–12 Hz range are too disruptive and will erode the ecological validity of our high-fidelity simulation and training environments.

A potential solution to this challenge is to use frequencies that are above the range of human perception and the critical flicker frequency (CFF) threshold, the point at which a human can no longer detect a flicker and perceives it as an average of the oscillating stimuli [14] (a black and white flicker will, for instance, appear as solid gray above the CFF). The CFF is roughly 15 Hz depending on factors such as luminance and contrast [14], but under some circumstances humans can detect flicker at much higher frequencies. An unobtrusive, visually complex system would need to present the driving ssVEP stimulus at a frequency far higher than 15 Hz to minimize distraction and disruption. Garcia [15] examined ssVEP frequencies as high as 60 Hz for similar reasons, but because the power of the cortical oscillations in these higher frequency bands is far lower than that in the alpha band, the higher frequencies pose a signal-to-noise-ratio (SNR) challenge.

Here we examine frequencies higher than those reported by Garcia for two reasons. First, in a previous analysis of existing data, ssVEP harmonics of low-frequency ssVEPs (8.6 and 12 Hz) were strongest between 90–110 Hz and exhibited the same attention-related properties as their lower-frequency counterparts. Second, if high-frequency ssVEPs are sensitive to attention shifts, advanced signal processing methods may help us overcome the current SNR challenge. The purpose of this investigation is to explore whether high frequency flicker rates (above 70 Hz) drive ssVEPs that are sensitive to attention shifts to enable development of top-down attention monitoring tools for use in high fidelity training and simulation environments.

3 Methods

For purposes of developing an attention-monitoring system suitable for high fidelity training and simulation environments, we examined the ssVEP electroencephalographic data stimulated by two different frequencies (72 Hz, 100 Hz) during three different attentional states. This pilot exploration included repeated measurements from a single person in a dark room.

In natural environments an operator will shift attention fluidly between attending something he is looking at (foveating on), attending something in the periphery without looking directly at it, and ignoring information within the visual field. We examined the following three attentional states to determine the specificity of this method for distinguishing between them in naturalistic settings:

  • Foveate: The user is looking directly at, and attending to the driving frequency.

  • Attend: The user is attending to the driving frequency in his peripheral vision but his eyes are directed away from the driving frequency.

  • Ignore: The user’s eyes are directed away from (in the same location as “Attend”) and he is not paying attention to the driving frequency.

In all cases as long as the driving frequency is in the visual field it should generate a cortical ssVEP.

3.1 Display Stimuli

To drive the ssVEP we used Presentation software (version 18.1, Neurobehavioral Systems) to display an alternating black and white square checkerboard (8 squares per side, each square 32 × 32 pixels, for an overall size of 2.75 × 2.75 in.) on a gray background in the center of a high-frequency monitor (BenQ XL2430T 24-in. gaming monitor). This monitor refreshes at a higher rate than traditional computer monitors to improve the appearance of fast-moving and highly detailed video games, and is similar to the level of fidelity expected in high-fidelity simulations. To test different driving frequencies we set the refresh rate of the screen to 72 Hz and then to 100 Hz, and adjusted the Presentation software code to alternate the checkerboard at the corresponding frame rate.
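The stimulus geometry described above can be sketched in a few lines of Python. This is only an illustration of the pattern logic, not the Presentation script used in the study; the function name is hypothetical, and flipping polarity on every frame is one reading of "alternate the checkerboard at the corresponding frame rate".

```python
SQUARES_PER_SIDE = 8   # from the text
SQUARE_PX = 32         # from the text


def checkerboard(frame_index):
    """Return an 8 x 8 grid of 0 (black) / 1 (white) for a given frame.

    Assumption: the polarity of every square flips on each successive frame.
    """
    polarity = frame_index % 2
    return [
        [(row + col + polarity) % 2 for col in range(SQUARES_PER_SIDE)]
        for row in range(SQUARES_PER_SIDE)
    ]


# Overall side length of the checkerboard in pixels.
side_px = SQUARES_PER_SIDE * SQUARE_PX
```

Each square simply takes the opposite value on the next frame, so the time-averaged luminance of every square is the same mid-gray.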

3.2 Electroencephalography (EEG) Recordings

We recorded two minutes of EEG data for each condition using an Advanced Brain Monitoring Inc. (ABM) B-Alert X24 EEG system with the qEEG Standard Medical Montage sensor strip, which arranges electrodes according to the standard 10–20 system [16]. Data were collected at a 256 Hz sampling rate. For future applications and for examining higher frequency data streams, we would use a system that has a higher sampling frequency and does not impose an on-line bandpass filter.
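As a quick check on these recording parameters (a worked calculation from the values stated above, not an additional measurement), a two-minute recording at 256 Hz yields, per channel and condition,

$$ n = 256\;\mathrm{Hz} \times 120\;\mathrm{s} = 30720\;\mathrm{samples} $$

and the highest frequency that can be represented without aliasing is the Nyquist frequency,

$$ f_{Nyquist} = \frac{256\;\mathrm{Hz}}{2} = 128\;\mathrm{Hz} $$

Both driving frequencies (72 Hz and 100 Hz) fall below this limit, although the 100 Hz signal leaves comparatively little headroom.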

3.3 Data Processing

All data processing was performed in MATLAB (R2012a).

Filtering.

We exported all data into .mat files and processed only the raw data files from ABM. We isolated the frequencies of interest with a linear-phase 257-tap FIR filter, de-trended the data, and applied a Hamming window to suppress frequency sidelobes before performing a fast Fourier transform (FFT) to quantify power in 0.10 Hz bins.
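The spectral steps described above (de-trending, Hamming windowing, and a Fourier transform evaluated on a 0.10 Hz grid) might be sketched as follows in Python. This is a minimal illustration rather than the MATLAB code used in the study: it omits the FIR filter, evaluates a direct single-frequency DFT sum instead of a full FFT, and assumes a 10 s analysis window (2560 samples at 256 Hz) so that bins fall 0.1 Hz apart. The function names and the synthetic test signal are hypothetical.

```python
import cmath
import math


def detrend(x):
    """Subtract the least-squares straight line from x."""
    n = len(x)
    t_mean = (n - 1) / 2
    x_mean = sum(x) / n
    slope_num = sum((t - t_mean) * (v - x_mean) for t, v in enumerate(x))
    slope_den = sum((t - t_mean) ** 2 for t in range(n))
    slope = slope_num / slope_den
    return [v - (x_mean + slope * (t - t_mean)) for t, v in enumerate(x)]


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def preprocess(x):
    """De-trend, then taper with a Hamming window."""
    window = hamming(len(x))
    return [v * w for v, w in zip(detrend(x), window)]


def power_at(xw, freq_hz, fs):
    """Squared magnitude of the DFT of xw at one frequency, scaled by n**2."""
    n = len(xw)
    coeff = sum(v * cmath.exp(-2j * math.pi * freq_hz * t / fs)
                for t, v in enumerate(xw))
    return abs(coeff) ** 2 / n ** 2


FS = 256        # sampling rate from the text
N = 10 * FS     # assumed 10 s window -> 0.1 Hz bin spacing

# Synthetic signal: a unit 100 Hz sinusoid plus a slow linear drift.
signal = [math.sin(2 * math.pi * 100 * t / FS) + 0.01 * t / N for t in range(N)]
xw = preprocess(signal)
spectrum = {round(100 + 0.1 * k, 1): power_at(xw, 100 + 0.1 * k, FS)
            for k in range(10)}
```

With real recordings, the same functions would be applied to each electrode's time series before the averaging step.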

Averaging.

For each condition and frequency we averaged across the electrodes closest to the extrastriate regions of the brain (O1, O2, POz, Pz, P3 and P4). These are also the electrode sites most consistent with Mishra et al. and Russell et al. [7, 8]. We then averaged the power from 72–72.9 Hz and from 100–100.9 Hz to estimate the relative power of each condition across each 1 Hz band of interest.
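The averaging described above could be sketched as follows, assuming per-electrode power has already been computed in 0.1 Hz bins. The data layout (a dictionary mapping electrode name to a dictionary of bin frequency and power), the function name, and the example values are hypothetical.

```python
from statistics import mean


def band_power(spectra, band_start_hz, n_bins=10, bin_width_hz=0.1):
    """Average power across electrodes, then across the bins of one band.

    spectra maps electrode name -> {bin frequency in Hz: power}.
    With the defaults, band_start_hz=72 covers the bins 72.0 ... 72.9 Hz.
    """
    bin_freqs = [round(band_start_hz + k * bin_width_hz, 1) for k in range(n_bins)]
    per_bin = [mean(spectra[name][f] for name in spectra) for f in bin_freqs]
    return mean(per_bin)


# Made-up flat spectra at two of the electrode sites, for illustration only.
example = {
    "POz": {round(72 + 0.1 * k, 1): 2.0 for k in range(10)},
    "Pz": {round(72 + 0.1 * k, 1): 4.0 for k in range(10)},
}
mean_power_72 = band_power(example, 72)
```

Calling the function once per condition and band gives the values that are compared across the Foveate, Attend and Ignore states.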

4 Results

Averaged data suggest that both the 72 and 100 Hz ssVEP exhibit attention-related modulation, although the pattern of relative strengths of the signals among the three attentional states was unexpected.

The Attend attentional state, in which the performer was not looking directly at the driving stimulus, generated the largest ssVEP, while the Foveate attentional state, in which the performer was looking directly at the stimulus, generated the lowest power. This is strikingly different from results observed at lower frequencies, where foveating on a driving flicker will generate the greatest ssVEP power, and ignoring the driving flicker will generate the lowest ssVEP power. From the averages of these data, it appears that ignoring the high-frequency stimuli in the visual periphery generates a larger ssVEP signal than looking directly at, and attending to, the stimulus (see Fig. 2).

Fig. 2. Relative power of each frequency in foveate, attend and ignore attentional conditions in 72 Hz and 100 Hz ssVEPs in one person. Error bars are the standard deviation for each 0.10 Hz bin averaged across the 1 Hz band of interest. The Y axes are scaled to the relative power of each frequency.

Also somewhat surprising is that the power of the 100 Hz ssVEP was greater than that of the 72 Hz ssVEP. The power was small at both frequencies, but the general trend beyond the alpha band is that the higher the frequency, the lower the power in neural activity [13]. See Fig. 3 for a detail view of the spectral power difference between Attend and Ignore conditions at a single electrode site from 100–100.9 Hz.

Fig. 3. Detail of the relative spectral power from covertly attending (attend) and ignoring (ignore) the 100 Hz frequency signal at the POz electrode site between 100–101 Hz.

5 Discussion

We found that visual signals above the frequency range of human perception appear to be sensitive to endogenous and overt shifts in attention, though these limited results suggest the pattern may not be the same as is observed with lower frequencies.

The power measures for the ssVEPs were small in all conditions, meaning a fair amount of online processing will be needed to reliably extract the signal in real-time applications. The 100 Hz ssVEP exhibited greater power than the 72 Hz ssVEP over the same recording duration, and it exhibited a larger relative difference between the three conditions, suggesting that 100 Hz may be a more reliable frequency for attention monitoring.

Most surprising was that the strongest ssVEP at both frequencies emerged in the covertly, rather than the overtly, attended condition (Attend). Typically, foveated signals generate larger ssVEPs, but these trends have been reported mainly at lower frequencies. While the present exploration was limited in scope, the consistency of the pattern across both frequencies suggests the observation may not be coincidental. One possible explanation may lie within either the uneven distribution of rod and cone cells within the retina, or the relative sensitivity of the magno- and parvocellular pathways to contrast and motion. If so, the mechanisms behind heightened peripheral sensitivity to motion may also underlie our results. In normal daylight conditions the rods responsible for most peripheral vision are saturated; however, these data were collected in a dark room. This too has practical implications for ssVEP measurement and suggests different strategies may be necessary for well-illuminated daytime simulations compared to darker nighttime or theater-like environments. These observations warrant further exploration to determine the consistency and reliability of the observation under various visual eccentricities from center and lighting conditions.

5.1 Limitations and Next Steps

This limited analysis provided an indication that high-frequency ssVEPs are attention sensitive, but the unexpected results suggest a full study is needed to examine the consistency and variance across individuals to determine if the effect observed in this exploration is reliable. The curious results warrant further investigation, in particular to test the classification accuracy (sensitivity and specificity) of high-frequency ssVEPs. Given the robust literature in ssVEP research, there are a number of additional factors that developers should consider when using ssVEPs in applied settings.

Individual Differences.

Analysis of high-frequency harmonics of low-frequency ssVEPs in a previously analyzed data set (from [7]) showed that there is high variability between individuals in terms of which electrodes exhibit the strongest AMI in response to the driving frequency. This is not surprising given slight individual differences in neuromorphology (e.g., cortical wrinkling) and other features that will affect the dipoles, summation of dipoles, and distortion of the neural signal between the cortex and the scalp electrodes. Those with trait anxiety exhibit phase-shifts in ssVEP entrainment [17]. Though this study included exploratory data from only one person, it holds that any system that intends to use ssVEP signals to monitor attention should account for such differences by either averaging across a series of electrodes where ssVEPs are usually strongest or by determining for each individual which electrodes detect the signals most reliably across the visual field. Though more labor intensive, the latter method directly addresses the SNR challenge, and reduces the number of electrodes needed for reliable ssVEP detection, simplifying subsequent monitoring sessions. Machine learning methods for developing personalized models to interpret complex neurophysiological signals (such as those discussed elsewhere in these Proceedings [18]) will be instrumental for accelerating this kind of personalized approach.
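One way the per-individual electrode selection described above might be approached is to rank electrodes by their AMI for a given person. The sketch below assumes RMS amplitudes per electrode are already available for the attended and ignored conditions; the function name, the choice of AMI as the ranking criterion, and the example values are all hypothetical.

```python
def rank_electrodes_by_ami(rms_attend, rms_ignore, top_k=2):
    """Return the top_k electrodes by AMI and the full AMI table.

    rms_attend and rms_ignore map electrode name -> RMS amplitude
    for the attended and ignored conditions, respectively.
    """
    ami = {
        name: (rms_attend[name] - rms_ignore[name])
        / (rms_attend[name] + rms_ignore[name])
        for name in rms_attend
    }
    ranked = sorted(ami, key=ami.get, reverse=True)
    return ranked[:top_k], ami


# Made-up RMS values for one hypothetical individual.
attend = {"O1": 1.2, "POz": 2.0, "Pz": 1.5, "P3": 1.0}
ignore = {"O1": 1.0, "POz": 1.0, "Pz": 1.0, "P3": 1.0}
best, ami_table = rank_electrodes_by_ami(attend, ignore)
```

Other criteria, such as test-retest stability across sessions, could be substituted for AMI without changing the overall structure.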

Changes in State.

Changes to the performer’s state, including perceptual workload and anxiety, can also affect ssVEPs in terms of magnitude and phase. Emotional stimuli can generate larger ssVEPs compared to those associated with neutral stimuli [9–12]. Similarly, anticipatory anxiety increased the magnitude of a 12 Hz ssVEP during threat of shock [7]. Perceptual workload also decreased the neural response to signals presented in the central regions of a screen, while peripherally presented signals were unaffected by workload manipulations [19]. An applied system will have to account for these state-related effects on ssVEP strength to ensure the system responds specifically to attention rather than arousal.

Frequency-Specific Questions.

More than anything, this investigation has highlighted the need to examine the distinct properties of different frequencies used to generate ssVEPs. For instance, though ssVEPs in response to peripheral stimuli may be less sensitive to changes in perceptual load than those from centrally presented stimuli, given the differences we observed with high-frequency ssVEPs it is difficult to know without testing directly whether high-frequency ssVEPs would show similar eccentricity-related differences. There is also evidence that different frequencies tag distinct neural networks [20], posing both an opportunity to target neural networks of interest and a potential challenge, as each of these networks may be sensitive to different state- and trait-related variables.

Phase Shifts.

Beyond frequency-specific and individual differences, ssVEPs also exhibit phase shifts in the “fast pathway” above 15 Hz [21]. This poses an additional signal processing challenge for online collection. Others have found phase-shifts during initial entrainment associated with different stimuli and as a function of trait anxiety [17]. We did not examine phase shifts in this analysis, but future investigations should consider how signal-processing strategies should account for phase shifts in ssVEPs.
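As a starting point for such an analysis, the phase of a signal at the driving frequency can be read from the angle of its Fourier coefficient at that frequency. The following minimal Python sketch estimates the phase difference between two segments; the function names and the synthetic 100 Hz signals are hypothetical, and the method assumes the segment spans a whole number of cycles.

```python
import cmath
import math


def phase_at(x, freq_hz, fs):
    """Phase (radians) of the component of x at freq_hz (direct DFT sum)."""
    coeff = sum(v * cmath.exp(-2j * math.pi * freq_hz * t / fs)
                for t, v in enumerate(x))
    return cmath.phase(coeff)


def phase_shift(x, y, freq_hz, fs):
    """Phase of y relative to x at freq_hz, wrapped to [-pi, pi]."""
    delta = phase_at(y, freq_hz, fs) - phase_at(x, freq_hz, fs)
    return math.atan2(math.sin(delta), math.cos(delta))


# Synthetic example: two 100 Hz cosines, 1 s at 256 Hz, offset by pi / 4.
fs, f = 256, 100
x = [math.cos(2 * math.pi * f * t / fs) for t in range(fs)]
y = [math.cos(2 * math.pi * f * t / fs + math.pi / 4) for t in range(fs)]
shift = phase_shift(x, y, f, fs)
```

Tracking this quantity over successive windows would show whether the evoked response drifts relative to the stimulus, which matters for any time-domain averaging scheme.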

In summary, this exploration has demonstrated that high-frequency ssVEPs may offer the kind of attention-sensitivity necessary to detect covert attention shifts for which other methods cannot currently account. While here we have discussed evoked potentials only in the visual domain, steady state potentials also occur in the auditory (steady state auditory evoked potentials, or ssAEPs) and somatosensory (ssSEPs) domains. Monitoring limited auditory attention is similarly important for understanding attention in dynamic environments, yet there are currently no “ear-tracking” correlates to eye-tracking. ssAEPs may thus offer methods for monitoring auditory attention for use independently of, or in conjunction with, ssVEPs in high-fidelity simulation and training environments for more complete real-time and multi-modal attention monitoring.