Mid-level feature contributions to category-specific gaze guidance
We have previously shown that scene categories can be predicted from observers’ eye movements when they view photographs of real-world scenes. The time course of category predictions reveals the differential influences of bottom-up and top-down information. Here we used these known differences to determine to what extent image features at different representational levels contribute toward guiding gaze in a category-specific manner. Participants viewed grayscale photographs and line drawings of real-world scenes while their gaze was tracked. Scene categories could be predicted from fixation density at all times over a 2-s time course in both photographs and line drawings. We replicated the shape of the prediction curve found previously, with an initial steep decrease in prediction accuracy from 300 to 500 ms, representing the contribution of bottom-up information, followed by a steady increase, representing top-down knowledge of category-specific information. We then computed low-level features (luminance contrasts and orientation statistics), mid-level features (local symmetry and contour junctions), and Deep Gaze II output from the images, and used these maps as a reference in our category predictions to assess their respective contributions to category-specific guidance of gaze. As expected, low-level salience contributes mostly to the initial bottom-up peak of gaze guidance. Conversely, the mid-level features that describe scene structure split their contributions: local symmetry contributes to both bottom-up and top-down guidance, whereas contour junctions play a more prominent role in the top-down guidance of gaze.
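To make the idea of predicting scene categories from fixation density concrete, the following is a minimal sketch and not the procedure used in the study. It assumes that each category's fixation density map is a normalized sum of Gaussians centred on training fixations, and that a trial is assigned to the category whose map has the highest mean density at that trial's fixation locations. The function names, the grid size, the value of `sigma`, and the toy "beach" and "city" fixations are all hypothetical.

```python
import numpy as np

def fixation_density_map(fixations, shape, sigma=2.0):
    """Sum of isotropic Gaussians centred on (row, col) fixations, normalized to sum to 1."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    density = np.zeros(shape, dtype=float)
    for r, c in fixations:
        density += np.exp(-((rows - r) ** 2 + (cols - c) ** 2) / (2.0 * sigma ** 2))
    total = density.sum()
    return density / total if total > 0 else density

def predict_category(test_fixations, category_maps):
    """Pick the category whose density map is highest, on average, at the test fixations."""
    idx = np.asarray(test_fixations, dtype=int)
    scores = {
        name: float(fdm[idx[:, 0], idx[:, 1]].mean())
        for name, fdm in category_maps.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy data: "beach" fixations lie along a horizontal band (a horizon),
# "city" fixations lie along a vertical band (a building edge).
shape = (40, 40)
training = {
    "beach": [(20, c) for c in range(0, 40, 2)],
    "city": [(r, 10) for r in range(0, 40, 2)],
}
category_maps = {name: fixation_density_map(fix, shape) for name, fix in training.items()}

# A held-out trial whose fixations follow the horizon.
test_trial = [(20, 5), (20, 15), (20, 25), (20, 35)]
predicted, scores = predict_category(test_trial, category_maps)
print(predicted, scores)
```

In this sketch, replacing the training fixation maps with a feature map computed from the image (for example a contrast or symmetry map) would give a rough analogue of using image features as a reference for prediction; how the study actually combines the two is not specified in the abstract.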
Keywords: Eye movements, Visual attention, Scene perception