
Classification of Eye Tracking Data in Visual Information Processing Tasks Using Convolutional Neural Networks and Feature Engineering

  • Original Research
  • Published in SN Computer Science

Abstract

Eye tracking technology has been adopted in numerous human–computer interaction (HCI) studies to understand visual and display-based information processing, as well as the underlying cognitive processes users employ when navigating a computer interface. Analyzing eye tracking data can also help identify interaction patterns with respect to salient regions of an information display. Deep learning is increasingly used in the analysis of eye tracking data because it enables the classification of large volumes of eye tracking results. In this paper, eye tracking data and convolutional neural networks (CNNs) were used to classify three types of information presentation methods. First, a number of data preprocessing and feature engineering approaches were applied to eye tracking data collected through a controlled visual information processing experiment. The resulting data were then used as input to compare four CNN models with different architectures. Two of the CNN models classified the information presentations effectively, with overall accuracy greater than 80%.
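To make the pipeline described in the abstract concrete, the following is a minimal sketch of a three-class CNN classifier in Keras. It is illustrative only: the input shape (gaze data rasterized to a 64×64 grid), layer sizes, and training settings are assumptions for the sake of a runnable example, not the four architectures compared in the paper, and the random arrays stand in for the preprocessed, feature-engineered eye tracking data.

```python
# Hypothetical sketch only: a small CNN that classifies eye-tracking-derived
# feature images into three information-presentation classes. The input shape,
# layer sizes, and training settings are illustrative assumptions, not the
# architectures evaluated in the paper.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 3            # three information presentation methods
INPUT_SHAPE = (64, 64, 1)  # assumed: gaze data rasterized to a 64x64 grid


def build_model() -> keras.Model:
    """A small convolutional classifier for rasterized gaze data."""
    return keras.Sequential([
        keras.Input(shape=INPUT_SHAPE),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ])


if __name__ == "__main__":
    # Stand-in data: in the study this would be the preprocessed,
    # feature-engineered eye tracking records, not random noise.
    x = np.random.rand(100, *INPUT_SHAPE).astype("float32")
    y = keras.utils.to_categorical(
        np.random.randint(0, NUM_CLASSES, size=100), NUM_CLASSES
    )

    model = build_model()
    model.compile(optimizer="adam", loss="categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(x, y, epochs=3, batch_size=16, validation_split=0.2)
```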




Acknowledgements

This research was partially supported by the Towson University School of Emerging Technologies. We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan V GPU used for this research.

Author information

Corresponding author

Correspondence to Yuehan Yin.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Research involving human participants and/or animals

The study protocol was approved by the Towson University IRB (Exemption number: 14-X145).

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article

Cite this article

Yin, Y., Alqahtani, Y., Feng, J.H. et al. Classification of Eye Tracking Data in Visual Information Processing Tasks Using Convolutional Neural Networks and Feature Engineering. SN COMPUT. SCI. 2, 59 (2021). https://doi.org/10.1007/s42979-020-00444-0

