Abstract
Human scanpath prediction aims to use computational models to mimic human gaze shifts under free-viewing conditions. Previous works that rely on low-level features, hand-crafted high-level features, saccadic amplitude, or memory bias cannot fully explain the mechanism of visual attention. In this paper, we propose a comprehensive method that predicts scanpaths from four aspects: low-level features, saccadic amplitude, semantic features learned via a deep convolutional neural network, and memory bias, including both short-term and long-term memory. Probabilities are calculated for all candidate regions in an image, and the next fixation point is the region with the largest probability product. Moreover, fixation duration is used for the first time as a key factor in modeling the memory effect in scanpath prediction. Experiments on two public datasets demonstrate the effectiveness of the proposed method, and comparisons with state-of-the-art methods show that it outperforms them.
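The selection rule described in the abstract (pick the candidate region with the largest probability product) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `next_fixation`, the dictionary keys, and all probability values below are hypothetical, and the sketch assumes each of the four factors has already been reduced to one probability per candidate region.

```python
from math import prod


def next_fixation(factor_probs):
    """Return (best_index, scores) for a set of candidate regions.

    factor_probs maps a factor name to a list holding one probability
    per candidate region. The score of a region is the product of its
    probabilities across all factors (an assumed reading of the
    "largest probability product" rule in the abstract).
    """
    columns = list(factor_probs.values())
    n_regions = len(columns[0])
    if any(len(col) != n_regions for col in columns):
        raise ValueError("every factor needs one probability per region")

    # Multiply the per-factor probabilities region by region.
    scores = [prod(col[i] for col in columns) for i in range(n_regions)]

    # The next fixation is the region with the largest product.
    best = max(range(n_regions), key=scores.__getitem__)
    return best, scores


# Hypothetical probabilities for three candidate regions.
factors = {
    "low_level": [0.2, 0.5, 0.3],
    "saccadic_amplitude": [0.6, 0.3, 0.1],
    "semantic": [0.1, 0.7, 0.2],
    "memory_bias": [0.4, 0.4, 0.2],
}

best, scores = next_fixation(factors)
print(best)  # index of the selected region
```

With these made-up numbers, the second region (index 1) has the largest product and would be chosen as the next fixation. A full scanpath would repeat this step, updating the saccadic-amplitude and memory terms after each fixation; how the paper performs those updates is not stated in the abstract.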
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Shao, X., Luo, Y., Zhu, D., Li, S., Itti, L., Lu, J. (2017). Scanpath Prediction Based on High-Level Features and Memory Bias. In: Liu, D., Xie, S., Li, Y., Zhao, D., El-Alfy, ES. (eds) Neural Information Processing. ICONIP 2017. Lecture Notes in Computer Science(), vol 10636. Springer, Cham. https://doi.org/10.1007/978-3-319-70090-8_1
DOI: https://doi.org/10.1007/978-3-319-70090-8_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-70089-2
Online ISBN: 978-3-319-70090-8
eBook Packages: Computer Science (R0)