Navigational Affordance Cortical Responses Explained by Scene-Parsing Model

  • Kshitij Dwivedi
  • Gemma Roig
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11131)


Deep Neural Networks (DNNs) are the leading models for explaining the population responses of neurons in the visual cortex. Recent studies show that responses of some task-specific brain regions can also be explained by a DNN trained for classification. In this work, we propose that responses of task-specific brain regions are better explained by DNNs trained on a similar task. We first show that responses of scene-selective visual areas, such as the parahippocampal place area (PPA) and the occipital place area (OPA), are better explained by a DNN trained for scene classification than by one trained for object classification. Next, we consider the particular case of OPA, which has been shown to encode navigational affordances. We argue that scene parsing, a task that predicts the class of each pixel in a scene, is more closely related to navigational affordances than scene classification is. Our results show that responses in OPA are better explained by the scene-parsing model than by the scene-classification model.
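The comparison described above is typically carried out with representational similarity analysis (RSA), the method named in the keywords: model features and brain responses are each summarized as a representational dissimilarity matrix (RDM) over stimuli, and the two RDMs are compared. The sketch below is a minimal, generic illustration of that idea; the variable names and random data are illustrative assumptions, not taken from the paper.

```python
# Minimal RSA sketch: compare a model's stimulus representations to brain
# responses via their representational dissimilarity matrices (RDMs).
import numpy as np
from scipy.stats import spearmanr

def rdm(responses):
    """(stimuli x features) -> (stimuli x stimuli) dissimilarity, 1 - Pearson r."""
    return 1.0 - np.corrcoef(responses)

def rsa_score(model_features, brain_responses):
    """Spearman correlation between the off-diagonal upper triangles of the RDMs."""
    m, b = rdm(model_features), rdm(brain_responses)
    iu = np.triu_indices_from(m, k=1)  # exclude the zero diagonal
    rho, _ = spearmanr(m[iu], b[iu])
    return rho

# Toy data: 20 stimuli, 100 model units, 50 voxels (hypothetical sizes).
rng = np.random.default_rng(0)
features = rng.standard_normal((20, 100))
voxels = features[:, :50] + 0.5 * rng.standard_normal((20, 50))  # correlated responses
print(rsa_score(features, voxels))
```

A Spearman (rank) correlation is used between RDMs, as is common in RSA, so that the comparison does not assume a linear relation between the two dissimilarity scales.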


Deep Neural Networks · Representational similarity analysis · Occipital Place Area · Neural Encoding



This work was funded by the MOE SUTD SRG grant (SRG ISTD 2017 131). Kshitij Dwivedi was also funded by the SUTD President’s Graduate Fellowship.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Singapore University of Technology and Design, Singapore, Singapore
