Performance Evaluation of the Gazepoint GP3 Eye Tracking Device Based on Pupil Dilation

Mannaru, Pujitha; Balasingam, Balakumar; Pattipati, Krishna; Sibley, Ciara; Coyne, Joseph T.

doi:10.1007/978-3-319-58628-1_14

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10284)

Included in the following conference series:

International Conference on Augmented Cognition

Abstract

Eye tracking is considered one of the most salient methods for studying the cognitive demands placed on humans in human-computer interactive systems, owing to its unobtrusiveness and flexibility and to the development of inexpensive eye trackers. In this work, we evaluate the applicability of these low-cost eye trackers to the study of pupillary response under varying memory loads and luminance conditions. Specifically, we examine a low-cost eye tracker, the Gazepoint GP3, and objectively evaluate its ability to differentiate pupil dilation metrics under different cognitive loads and luminance conditions. The classification performance is computed in the form of a receiver operating characteristic (ROC) curve, and the results indicate that the Gazepoint GP3 is a reliable eye tracker for human-computer interaction applications requiring pupil dilation studies.

1 Introduction

Eye tracking metrics have been found to be useful indicators of visual attention and cognitive workload in numerous application areas, including reading and language comprehension [1], driving [2], individual differences [3], gaming devices [4], and medical applications [5]. Eye tracking devices (eye trackers) are used to collect measurements such as pupil dilation, gaze locations and eye-closing patterns. Recent technical advances in video sensors and miniaturized computing power have resulted in cost-effective, mass-produced eye tracking devices; thus, several low-cost eye tracking devices have become available to researchers. However, the effectiveness of these low-cost devices for studying human behavior remains under investigation [6,7,8,9,10,11,12,13] and is the subject of this paper. Specifically, we examine a low-cost eye tracker, the Gazepoint GP3 (cost $\approx$ \$500), and objectively evaluate its ability to differentiate pupil dilation metrics under different cognitive loads and luminance conditions. To our knowledge, this is one of the first studies reporting the effectiveness of the Gazepoint GP3 in capturing pupillary data.

Several pupillary metrics have been proposed as useful indices of cognitive context [14,15,16]. Of those, we employ two widely accepted metrics in this paper: one computed in the time domain and the other in the frequency domain. Using data collected by the Gazepoint GP3 eye tracking device, a time domain measure, the task evoked pupillary response (TEPR) [17], and a recently published frequency domain measure, the pupillary power spectral density (PSD) [18], are computed and evaluated as indicators of mental workload under different luminance conditions. It is well established that pupil diameter is affected by both mental workload and luminance conditions [19,20,21,22,23,24]. Therefore, the objective of our experiment is to verify the potential use of the Gazepoint system for studying the impact of these two factors on pupil diameter in studies involving cognitive context analysis.

To this end, we conducted a digit span task [19] experiment under different luminance conditions, which is explained in Sect. 2. The rest of the paper is organized as follows: data collection and analysis methods are described in Sects. 2 and 3, respectively; the results of the classification analysis are presented and discussed in Sect. 4; and the paper is concluded in Sect. 5.

2 Experiment

2.1 Subjects

Twenty participants ranging in age from 22 to 29 years ($M = 23.9, SD = 2.41$) voluntarily participated in the experiment, which was conducted by researchers from the Naval Research Laboratory (NRL) at the Naval Aerospace Medical Institute (NAMI).

2.2 Apparatus

All eye tracking data were collected using the Gazepoint GP3 system. The system was calibrated for each user according to the Gazepoint Application Program Interface (API) manual [25]. The GP3 collects pupillary data, specifically the pupil size in pixels for each eye and the corresponding binary quality factor (valid/invalid), at 60 samples/s.

2.3 Task

A visual digit span task (also known as memory span task), which is a common technique used for assessing working memory capacity, was employed to assess the pupillary response of the participants to mental workload. In this task, participants are presented with a series of numbers and are then asked to recall the digits in the order they saw them. Longer series of numbers present more of a challenge for working memory, while shorter series are expected to be easier.

A luminance change task was employed to assess the pupillary response of the participants to screen luminance. While completing the digit span task, participants fixated on a monitor whose background luminance varied (black, gray, or white).

2.4 Procedure

As mentioned in the previous section, participants engaged in a digit span task. The experiment used a within-subject (i.e., repeated measures) design: each participant completed all four digit set sizes (3, 5, 7 and 9, randomly ordered and exhaustive) three times for each of the three background colors (white, gray, and black). Thus, a total of 36 (= 4 set sizes $\times$ 3 colors $\times$ 3 repetitions) trials were conducted per participant. Participants were told to focus on a central fixation cross (a “+” sign $\sim$50 pixels tall and wide) that was offset from the background color (80 brighter for the black and gray backgrounds, and 80 darker for the white background). The digits were then presented sequentially at $\sim$1 s per digit. Following each set of digits (e.g., “2, 6, 1, 8, 4”), there was a pause of $\approx$3 s, after which a numeric keypad appeared on the monitor. The pupillary measures from this time segment, known as the encoding phase of the memory, are analyzed here. Participants then used the mouse to enter the string of digits (“2, 6, 1, 8, 4”) by clicking on the corresponding numbers in order. The keypad was used to ensure that participants continued to fixate on the screen while they were responding. When satisfied, participants clicked the submit button. They were not given feedback on their response accuracy. The total time to complete the digit span task varied from 10 to 15 min, depending on the participant’s response times.
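The trial count above can be checked with a short sketch. The text specifies only that the set sizes were randomly ordered and exhaustive within each background color and repetition; the order of the backgrounds, the function name `build_schedule` and the fixed seed are illustrative assumptions, not details of the original experiment software.

```python
import random
from collections import Counter

SET_SIZES = (3, 5, 7, 9)
BACKGROUNDS = ("black", "gray", "white")
REPETITIONS = 3


def build_schedule(seed=0):
    """Return a list of (background, set_size, repetition) trials.

    Assumption: trials are blocked by background and repetition, and only
    the order of the four set sizes is shuffled inside each block.
    """
    rng = random.Random(seed)
    schedule = []
    for background in BACKGROUNDS:
        for repetition in range(1, REPETITIONS + 1):
            sizes = list(SET_SIZES)
            rng.shuffle(sizes)  # random but exhaustive: every size appears once
            schedule.extend((background, size, repetition) for size in sizes)
    return schedule


if __name__ == "__main__":
    trials = build_schedule()
    print(len(trials), "trials per participant")
    print(Counter((bg, size) for bg, size, _ in trials).most_common(3))
```

Each (background, set size) pair appears exactly three times, which gives the 4 $\times$ 3 $\times$ 3 = 36 trials per participant.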

3 Data Analysis

The Gazepoint GP3 collects the following pupillary data: the pupil size in pixels for each eye and the corresponding binary quality factors (valid/invalid), both at 60 samples/s, and the scale factor of each pupil (unitless), whose value equals 1 at calibration depth, is less than 1 when the user is closer to the eye tracker, and is greater than 1 when the user is farther away. Only data from the encoding time segment are analyzed in this work, since human factors researchers have established that the maximum pupil dilation occurs during the encoding of the stimulus materials in short-term memory recall tasks [26, 27].

3.1 Data Preprocessing

For time-domain analysis (TEPR), the poor-quality samples (quality factor = 0) of the pupil size signals were marked as missing values (NaN in MATLAB® [28]). The pupil size data of the eye with fewer missing observations [29] were used for analysis. A “clean-up” function was employed to remove all data below the 4th percentile and above the 98th percentile, in order to remove any sudden dips or peaks in the pupil size signal. Then, a Hampel filter (of order 6) [30] was applied to remove outliers, and a linear interpolator was used to recover missing values. Figure 1a shows an example of raw and filtered data signals.
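The preprocessing steps above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' MATLAB code: "order 6" is read here as a half-window of 6 samples on each side, the outlier threshold of 3 scaled MADs is an assumed default, the percentile uses linear interpolation between order statistics, and all function names are hypothetical.

```python
import math
from statistics import median

NAN = float("nan")


def mask_invalid(sizes, valid_flags):
    """Replace samples whose quality flag is not 1 with NaN."""
    return [s if flag == 1 else NAN for s, flag in zip(sizes, valid_flags)]


def pick_eye(left, right):
    """Return the signal with fewer NaN samples (ties go to the left eye)."""
    def missing(x):
        return sum(math.isnan(v) for v in x)
    return left if missing(left) <= missing(right) else right


def percentile(values, q):
    """Percentile (q in 0..100) by linear interpolation of sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def clip_percentiles(x, low=4.0, high=98.0):
    """Mark samples outside the [low, high] percentile range as NaN."""
    present = [v for v in x if not math.isnan(v)]
    lo, hi = percentile(present, low), percentile(present, high)
    # A NaN fails both comparisons, so it stays NaN.
    return [v if lo <= v <= hi else NAN for v in x]


def hampel(x, half_window=6, n_sigma=3.0):
    """Replace outliers by the local median (NaN samples are skipped)."""
    out = list(x)
    for i, v in enumerate(x):
        if math.isnan(v):
            continue
        start = max(0, i - half_window)
        window = [w for w in x[start:i + half_window + 1] if not math.isnan(w)]
        med = median(window)
        mad = median([abs(w - med) for w in window])
        if abs(v - med) > n_sigma * 1.4826 * mad:
            out[i] = med
    return out


def interpolate_nans(x):
    """Fill NaN gaps linearly; gaps at the edges take the nearest known value."""
    known = [i for i, v in enumerate(x) if not math.isnan(v)]
    out = list(x)
    for i, v in enumerate(x):
        if not math.isnan(v) or not known:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:
            out[i] = x[right]
        elif right is None:
            out[i] = x[left]
        else:
            t = (i - left) / (right - left)
            out[i] = x[left] + t * (x[right] - x[left])
    return out


def preprocess(left, left_valid, right, right_valid):
    """Apply masking, eye selection, clipping, Hampel filtering, interpolation."""
    eye = pick_eye(mask_invalid(left, left_valid), mask_invalid(right, right_valid))
    return interpolate_nans(hampel(clip_percentiles(eye)))
```

Note that the percentile clip discards the most extreme samples of every recording, including recordings without artifacts; the removed samples are then recovered by the interpolation step.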

For frequency-domain analysis (PSD), the linear trend in the above preprocessed signals was removed using the detrend function in MATLAB®, and the resulting signals were passed through a zero-phase lowpass Butterworth filter with a cutoff frequency $f_c = 4$ Hz using the filtfilt function, since most of the pupillary activity falls in the frequency range of 0–4 Hz [31]. Figure 1b shows an example of detrended and filtered data signals.
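The two operations in this step can be written out as a worked sketch. The filter order $N$ is not stated in the text and is left symbolic; the magnitude response below assumes a digital Butterworth filter obtained through the bilinear transform, and $L$ (the number of samples in the signal) and $f_s = 60$ samples/s are notation introduced here for illustration.

```latex
% Linear detrending: subtract the least-squares line from the preprocessed signal x[n].
\tilde{x}[n] = x[n] - (\hat{a}\,n + \hat{b}), \qquad
(\hat{a}, \hat{b}) = \arg\min_{a,b} \sum_{n=0}^{L-1} \bigl( x[n] - a\,n - b \bigr)^2

% Squared magnitude response of an order-N digital Butterworth lowpass filter,
% with \omega = 2\pi f / f_s and \omega_c = 2\pi f_c / f_s.
\bigl| H(e^{j\omega}) \bigr|^2 =
  \frac{1}{1 + \left( \dfrac{\tan(\omega/2)}{\tan(\omega_c/2)} \right)^{2N}}

% Forward-backward (zero-phase) filtering applies H once in each direction.
\bigl| H_{\mathrm{zp}}(e^{j\omega}) \bigr| = \bigl| H(e^{j\omega}) \bigr|^2, \qquad
\angle H_{\mathrm{zp}}(e^{j\omega}) = 0
```

One consequence of the forward-backward pass is that the attenuation at the cutoff frequency doubles in decibels: a single pass gives $|H|^2 = 1/2$ (about $-3$ dB) at $f_c$, so the zero-phase result is attenuated by about $-6$ dB at 4 Hz.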

3.2 Data Analysis

Task Evoked Pupillary Response (TEPR): To evaluate the ability of the eye tracker to capture the changes in pupil diameter caused by changes in mental workload, we analyzed the data for set sizes 3 (labeled EASY), 5 (labeled MEDIUM) and 7 (labeled HARD) only. Set size 9 was excluded from the analysis because recall performance dropped to 65% (i.e., participants remembered only 65% of the 9 numbers) and variability between participants increased, suggesting either that it was too difficult for some participants or that some participants gave up. For classification purposes, the median values of the pupil size in the encoding phase (TEPR) were computed for each person, set size, background color, and trial (e.g., the pupil size of person 13 for set size 3 on a black background in the first trial) over a sliding window of 30 samples with an overlap of 25 samples ($\approx$83% overlap). A simple cut-point grouping into binary classes was implemented for the pairs of set sizes 3 (EASY) vs. 7 (HARD), 3 (EASY) vs. 5 (MEDIUM) and 5 (MEDIUM) vs. 7 (HARD), using the corresponding pairs of moving-median filtered signals. The Receiver Operating Characteristic (ROC) curves [32] were drawn by varying the cut-point from the minimum of the two signals to the maximum of the two signals in steps of 0.01 pixels.
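The sliding-median and cut-point procedure can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that a sample is assigned to the higher-load class when it exceeds the cut-point, and the summary functions `best_accuracy` and `area_under_curve` are hypothetical helpers, since the text does not state how an accuracy value is derived from each ROC curve.

```python
from statistics import median


def sliding_median(signal, window=30, overlap=25):
    """Median over windows of `window` samples that advance by window - overlap."""
    step = window - overlap
    return [median(signal[i:i + window])
            for i in range(0, len(signal) - window + 1, step)]


def roc_curve(low_load, high_load, step=0.01):
    """Sweep a cut-point from the joint minimum to the joint maximum.

    Returns (false positive rate, true positive rate) pairs, where a value
    above the cut-point is classified as the higher-load condition.
    """
    lo = min(min(low_load), min(high_load))
    hi = max(max(low_load), max(high_load))
    n_cuts = int(round((hi - lo) / step)) + 1
    curve = []
    for k in range(n_cuts):
        cut = lo + k * step
        tpr = sum(v > cut for v in high_load) / len(high_load)
        fpr = sum(v > cut for v in low_load) / len(low_load)
        curve.append((fpr, tpr))
    return curve


def best_accuracy(low_load, high_load, step=0.01):
    """Highest fraction of correctly classified samples over all cut-points."""
    n_low, n_high = len(low_load), len(high_load)
    return max((tpr * n_high + (1.0 - fpr) * n_low) / (n_low + n_high)
               for fpr, tpr in roc_curve(low_load, high_load, step))


def area_under_curve(curve):
    """Trapezoidal area under the ROC curve, closed at (0, 0) and (1, 1)."""
    points = sorted(set(curve) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


if __name__ == "__main__":
    # Synthetic pupil sizes in pixels, for illustration only.
    easy = sliding_median([20.0 + 0.01 * i for i in range(60)])
    hard = sliding_median([21.0 + 0.01 * i for i in range(60)])
    print(best_accuracy(easy, hard), area_under_curve(roc_curve(easy, hard)))
```

With a window of 30 samples and an overlap of 25, consecutive windows start 5 samples apart, so a 60-sample segment yields 7 median values.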

Power Spectral Density (PSD): The PSD of the pupil diameter signals was computed for each person using Welch’s method with segments of 50 samples and 50% overlap [18]. Each segment was windowed with a Hamming window. Only the ‘encoding’ phase was considered when computing the PSD for the memory tasks of set size 3 (EASY) vs. set size 5 (MEDIUM) vs. set size 7 (HARD). The PSD presented here is the average PSD over 20 participants $\times$ 3 trials, i.e., over a total of 60 trials for each background luminance color.
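Welch's method with these parameters can be sketched in plain Python. This is an illustration rather than the authors' MATLAB code: it assumes a one-sided density estimate, a symmetric Hamming window and no per-segment detrending, and it uses a direct discrete Fourier transform for self-containment; the function names are hypothetical.

```python
import cmath
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def welch_psd(x, fs=60.0, nperseg=50, overlap=0.5):
    """One-sided Welch PSD estimate; returns (frequencies in Hz, PSD values)."""
    step = int(nperseg * (1.0 - overlap))
    window = hamming(nperseg)
    scale = fs * sum(w * w for w in window)  # density normalization
    n_freqs = nperseg // 2 + 1
    psd = [0.0] * n_freqs
    n_segments = 0
    for start in range(0, len(x) - nperseg + 1, step):
        tapered = [s * w for s, w in zip(x[start:start + nperseg], window)]
        for k in range(n_freqs):
            dft = sum(tapered[n] * cmath.exp(-2j * math.pi * k * n / nperseg)
                      for n in range(nperseg))
            power = abs(dft) ** 2 / scale
            if 0 < k < nperseg / 2:
                power *= 2.0  # fold in negative frequencies (not DC or Nyquist)
            psd[k] += power
        n_segments += 1
    if n_segments == 0:
        raise ValueError("signal is shorter than one segment")
    freqs = [k * fs / nperseg for k in range(n_freqs)]
    return freqs, [p / n_segments for p in psd]


if __name__ == "__main__":
    # Synthetic 2.4 Hz oscillation sampled at 60 samples/s, for illustration only.
    signal = [math.sin(2.0 * math.pi * 2.4 * n / 60.0) for n in range(180)]
    freqs, psd = welch_psd(signal)
    peak = max(range(len(psd)), key=lambda k: psd[k])
    print("peak at", freqs[peak], "Hz")
```

Under these parameters the frequency bins are spaced 60/50 = 1.2 Hz apart, and consecutive segments start 25 samples apart.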

4 Results and Discussion

At the preprocessing stage, an average of 37% of the data was missing due to poor-quality recordings. Figure 2 shows the boxplots of average pupil diameters across the different background luminance and workload conditions. It is evident that the average pupil diameter for the black background is greater than that for the gray background, which, in turn, is greater than that for the white background; this pattern agrees with earlier pupillary light reflex studies, thereby confirming the GP3’s capability to capture light-sensitive pupillary readings. Figure 2 also shows the differences in average pupil diameter between workload tasks within the same background condition: the average pupil diameter for set size 3 is lower than that for set size 7 under all 3 luminance conditions. However, the pupil diameter for set size 5 is not clearly greater than that for set size 3, nor clearly smaller than that for set size 7, under the black and gray background luminance conditions.

To further analyze the differences in TEPRs corresponding to the different set sizes, we plotted the ROC curves from the classification described in Sect. 3. An example set of ROC curves for one participant is shown in Figs. 3, 4 and 5. For this particular example, Fig. 3 shows 100% accuracy in classifying the pupil size signals of set size 3 vs. 7 for all three background conditions, whereas the accuracy is 68% for set size 3 vs. 5 under the gray background and 78% for set size 5 vs. 7 under the white background. Table 1 gives the average classification accuracy over all participants and all 3 repeated trials. The minimum average classification accuracy is approximately 80%, which psychologists consider a significant value for detecting human cognitive context.

Table 1. Average accuracies in TEPR classification

Figure 6 shows the results of the PSD analysis, where Figs. 6(a–c) correspond to the black, gray and white background conditions, respectively. The results agree with earlier studies only in the average power spectral densities of set size 3 vs. set size 5 or 7. However, our results do not conform to the finding that the average PSD increases in the frequency ranges of 0.1–0.5 Hz and 1.6–3.5 Hz with increasing cognitive workload, as the average PSD for set size 5 is seen to be greater than that for set size 7. This could be due to the recovery of lost data points by linear interpolation, or to similar spectral behavior of the pupils during set sizes 5 and 7. Also, to our knowledge, there is as yet no detailed mechanism explaining this phenomenon of pupil control and PSD. Future research will integrate the PSD metrics into classification studies in an attempt to validate the findings of Peysakhovich et al. [18] and Nakayama and Shimizu [31].

5 Summary and Conclusion

In this paper, we evaluated the performance of the Gazepoint GP3, a low-cost eye tracker, using pupillary metrics that are already tested and used by human factors researchers: TEPRs and PSD. We collected pupil size data from 20 volunteers engaged in a visual digit span task. First, a preprocessing routine was employed to filter out outliers from the data for time domain analysis, and low-pass filtering was performed prior to frequency domain analysis. Then, TEPRs and PSDs were computed and studied for different digit set sizes. The classification performance is computed in the form of a receiver operating characteristic (ROC) curve, and the results show the applicability and limitations of low-cost eye tracking devices for cognitive workload researchers.

The results indicate that the Gazepoint GP3 is an easy-to-use and inexpensive tool that can be utilized in psychological studies involving pupil diameter data. The classification results indicate that the eye tracker performs well in classifying mental workloads under different background luminance conditions; however, it is not a robust tool for frequency domain analysis, which could be attributable to the linear interpolation of poor-quality readings. Researchers with budget constraints who are interested in incorporating pupillary measures of cognitive workload now have access to a reliable, inexpensive eye tracker. However, they should keep in mind that the GP3 is limited to collecting pupil diameter data for tasks that use a single screen and is vulnerable to the loss of chunks of data. Finally, we believe that low-cost eye trackers are of great value to researchers from all disciplines who are trying to incorporate human factors aspects into their systems.

References

1. Just, M.A., Carpenter, P.A.: The intensity dimension of thought: pupillometric indices of sentence processing. Can. J. Exp. Psychol./Revue canadienne de psychologie expérimentale 47(2), 310 (1993)
2. Palinko, O., Kun, A.L., Shyrokov, A., Heeman, P.: Estimating cognitive load using remote eye tracking in a driving simulator. In: Proceedings of 2010 Symposium on Eye-Tracking Research and Applications, pp. 141–144. ACM (2010)
3. Odenheimer, G., Funkenstein, H., Beckett, L., Chown, M., Pilgrim, D., Evans, D., Albert, M.: Comparison of neurologic changes in ‘successfully aging’ persons vs the total aging population. Arch. Neurol. 51(6), 573–580 (1994)
4. Ekman, I.M., Poikola, A.W., Mäkäräinen, M.K.: Invisible eni: using gaze and pupil size to control a game. In: CHI 2008 Extended Abstracts on Human Factors in Computing Systems, pp. 3135–3140. ACM (2008)
5. Ren, P., Barreto, A., Gao, Y., Adjouadi, M.: Affective assessment of computer users based on processing the pupil diameter signal. In: Engineering in Medicine and Biology Society, EMBC, 2011 Annual International Conference of the IEEE, pp. 2594–2597. IEEE (2011)
6. Zugal, S., Pinggera, J.: Low-cost eye-trackers: useful for information systems research? In: Iliadis, L., Papazoglou, M., Pohl, K. (eds.) CAiSE 2014. LNBIP, vol. 178, pp. 159–170. Springer, Cham (2014). doi:10.1007/978-3-319-07869-4_14
7. Dalmaijer, E.: Is the low-cost eyetribe eye tracker any good for research? Technical report, PeerJ PrePrints (2014)
8. Ooms, K., Dupont, L., Lapon, L., Popelka, S.: Accuracy and precision of fixation locations recorded with the low-cost eye tribe tracker in different experimental setups. J. Eye Mov. Res. 8(1) (2015)
9. Ferhat, O., Vilariño, F.: Low cost eye tracking. Comput. Intell. Neurosci. 2016, 17 (2016)
10. Coyne, J., Sibley, C.: Investigating the use of two low cost eye tracking systems for detecting pupillary response to changes in mental workload. In: Proceedings of Human Factors and Ergonomics Society Annual Meeting, vol. 60, pp. 37–41. SAGE Publications (2016)
11. Funke, G., Greenlee, E., Carter, M., Dukes, A., Brown, R., Menke, L.: Which eye tracker is right for your research? Performance evaluation of several cost variant eye trackers. In: Proceedings of Human Factors and Ergonomics Society Annual Meeting, vol. 60, pp. 1240–1244. SAGE Publications (2016)
12. Gibaldi, A., Vanegas, M., Bex, P.J., Maiello, G.: Evaluation of the Tobii EyeX eye tracking controller and Matlab toolkit for research. Behav. Res. Methods 1–24 (2016)
13. Janthanasub, V., Meesad, P.: Evaluation of a low-cost eye tracking system for computer input. King Mongkuts Univ. Technol. North Bangk. Int. J. Appl. Sci. Technol. 8(3), 185–196 (2015)
14. Causse, M., Sénard, J.-M., Démonet, J.F., Pastor, J.: Monitoring cognitive and emotional processes through pupil and cardiac response during dynamic versus logical task. Appl. Psychophysiol. Biofeedback 35(2), 115–123 (2010)
15. Mannaru, P., Balasingam, B., Pattipati, K., Sibley, C., Coyne, J.: Cognitive context detection in UAS operators using pupillary measurements. In: SPIE Defense+ Security, p. 98510Q. International Society for Optics and Photonics (2016)
16. Mandrick, K., Peysakhovich, V., Rémy, F., Lepron, E., Causse, M.: Neural and psychophysiological correlates of human performance under stress and high mental workload. Biol. Psychol. 121, 62–73 (2016)
17. Beatty, J.: Task-evoked pupillary responses, processing load, and the structure of processing resources. Psychol. Bull. 91(2), 276 (1982)
18. Peysakhovich, V., Causse, M., Scannella, S., Dehais, F.: Frequency analysis of a task-evoked pupillary response: luminance-independent measure of mental effort. Int. J. Psychophysiol. 97(1), 30–37 (2015)
19. Kahneman, D., Beatty, J.: Pupil diameter and load on memory. Science 154(3756), 1583–1585 (1966)
20. Tryon, W.W.: Pupillometry: a survey of sources of variation. Psychophysiology 12(1), 90–93 (1975)
21. Taptagaporn, S., Saito, S.: How display polarity and lighting conditions affect the pupil size of VDT operators. Ergonomics 33(2), 201–208 (1990)
22. Goldinger, S.D., Papesh, M.H.: Pupil dilation reflects the creation and retrieval of memories. Curr. Dir. Psychol. Sci. 21(2), 90–95 (2012)
23. Winn, B., Whitaker, D., Elliott, D.B., Phillips, N.J.: Factors affecting light-adapted pupil size in normal human subjects. Investig. Ophthalmol. Vis. Sci. 35(3), 1132–1137 (1994)
24. Peysakhovich, V., Vachon, F., Dehais, F.: The impact of luminance on tonic and phasic pupillary responses to sustained cognitive load. Int. J. Psychophysiol. 112, 40–45 (2017)
25. Gazepoint API. http://www.gazept.com/dl/Gazepoint_API_v2.0.pdf
26. Beatty, J., Lucero-Wagoner, B.: The pupillary system. Handb. Psychophysiol. 2, 142–162 (2000)
27. Gardner, R.M., Beltramo, J.S., Krinsky, R.: Pupillary changes during encoding, storage, and retrieval of information. Percept. Mot. Skills 41(3), 951–955 (1975)
28. MATLAB: R2016a. The MathWorks Inc., Natick (2016)
29. Papesh, M.H., Goldinger, S.D., Hout, M.C.: Memory strength and specificity revealed by pupillometry. Int. J. Psychophysiol. 83(1), 56–64 (2012)
30. Pearson, R.K.: Outliers in process modeling and identification. IEEE Trans. Control Syst. Technol. 10(1), 55–63 (2002)
31. Nakayama, M., Shimizu, Y.: Frequency analysis of task evoked pupillary response and eye-movement. In: Proceedings of 2004 Symposium on Eye Tracking Research Applications, pp. 71–76. ACM (2004)
32. Fawcett, T.: An introduction to ROC analysis. Pattern Recogn. Lett. 27(8), 861–874 (2006)

Acknowledgements

The authors would like to thank Dr. Jeffrey Morrison and the Command Decision Making (CDM) program at the U.S. Office of Naval Research and Department of Defense High Performance Computing Modernization Program for supporting this work. In addition, the authors would like to thank the symposium organizers for their encouragement of this work. This research was funded by the U.S. Office of Naval Research and the Department of Defense under contracts #N00014-12-1-0238, #N00014-16-1-2036 and #HPCM034125HQU.

Author information

Authors and Affiliations

Department of Electrical and Computer Engineering, University of Connecticut, Storrs, CT, USA
Pujitha Mannaru, Balakumar Balasingam & Krishna Pattipati
Warfighter Human Systems Integration Lab, Naval Research Laboratory, Washington DC, USA
Ciara Sibley & Joseph T. Coyne

Corresponding author

Correspondence to Pujitha Mannaru.

Editor information

Editors and Affiliations

SoarTech, Orlando, Florida, USA
Dylan D. Schmorrow
Design Interactive, Inc., Orlando, Florida, USA
Cali M. Fidopiastis

Cite this paper

Mannaru, P., Balasingam, B., Pattipati, K., Sibley, C., Coyne, J.T. (2017). Performance Evaluation of the Gazepoint GP3 Eye Tracking Device Based on Pupil Dilation. In: Schmorrow, D., Fidopiastis, C. (eds) Augmented Cognition. Neurocognition and Machine Learning. AC 2017. Lecture Notes in Computer Science, vol 10284. Springer, Cham. https://doi.org/10.1007/978-3-319-58628-1_14

DOI: https://doi.org/10.1007/978-3-319-58628-1_14
Published: 18 May 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-58627-4
Online ISBN: 978-3-319-58628-1
eBook Packages: Computer Science, Computer Science (R0)
