Abstract
We show that reproducing the functional properties of MT cells with various center-surround interactions enriches motion representation and improves action recognition performance. To do so, we propose a simplified bio-inspired model of the motion pathway in primates. The model is feedforward and restricted to the V1-MT cortical layers; its cortical cells cover the visual space with a foveated structure; and, more importantly, it reproduces some of the richness of the center-surround interactions of MT cells. Interestingly, as observed in neurophysiology, our MT cells not only behave like simple velocity detectors but also respond to several kinds of motion contrasts. Results show that this diversity of motion representation at the MT level is a major advantage for an action recognition task. Using motion maps as feature vectors, we applied a standard classification method to the Weizmann database and obtained an average recognition rate of 98.9%, which is higher than the recent results of Jhuang et al. (2007). These promising results encourage us to further develop bio-inspired models that incorporate other brain mechanisms and cortical layers in order to deal with more complex videos.
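To make the idea of an antagonistic center-surround interaction concrete, the following is a minimal illustrative sketch, not the model of the paper. It assumes a 2D map of responses from direction-tuned units and computes a rectified center-minus-surround value over square windows. The function names, the window shapes, and the `surround_weight` parameter are all hypothetical choices made for this example.

```python
def window_mean(values, row, col, radius, exclude_radius=-1):
    """Mean of `values` in a square window around (row, col).

    Cells within `exclude_radius` of the center are skipped, which turns
    the window into a square annulus (used here as the surround).
    """
    total, count = 0.0, 0
    for r in range(max(0, row - radius), min(len(values), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(values[0]), col + radius + 1)):
            in_excluded = abs(r - row) <= exclude_radius and abs(c - col) <= exclude_radius
            if not in_excluded:
                total += values[r][c]
                count += 1
    return total / count if count else 0.0


def mt_like_response(motion_map, row, col, center_radius=1,
                     surround_radius=3, surround_weight=1.0):
    """Rectified center-minus-surround response (hypothetical simplification)."""
    center = window_mean(motion_map, row, col, center_radius)
    surround = window_mean(motion_map, row, col, surround_radius,
                           exclude_radius=center_radius)
    return max(0.0, center - surround_weight * surround)


# Uniform motion everywhere: the surround cancels the center.
uniform = [[1.0] * 9 for _ in range(9)]
# Motion only in a small central patch: a motion contrast.
contrast = [[1.0 if 3 <= r <= 5 and 3 <= c <= 5 else 0.0 for c in range(9)]
            for r in range(9)]

uniform_with_surround = mt_like_response(uniform, 4, 4)
uniform_no_surround = mt_like_response(uniform, 4, 4, surround_weight=0.0)
contrast_with_surround = mt_like_response(contrast, 4, 4)
```

With `surround_weight=0.0` the unit acts as a plain detector of local motion, whereas an antagonistic surround suppresses the response to uniform motion and keeps the response to a motion contrast. Mixing units with different surround settings is one simple way to picture the diversity of motion representation described above.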
References
Gavrila, D.: The visual analysis of human movement: A survey. Computer Vision and Image Understanding 73(1), 82–98 (1999)
Goncalves, L., DiBernardo, E., Ursella, E., Perona, P.: Monocular tracking of the human arm in 3D. In: Proceedings of the 5th International Conference on Computer Vision, June 1995, pp. 764–770 (1995)
Mokhber, A., Achard, C., Milgram, M.: Recognition of human behavior by space-time silhouette characterization. Pattern Recognition Letters 29(1), 81–89 (2008)
Seitz, S., Dyer, C.: View-invariant analysis of cyclic motion. The International Journal of Computer Vision 25(3), 231–251 (1997)
Collins, R., Gross, R., Shi, J.: Silhouette-based human identification from body shape and gait. In: 5th Intl. Conf. on Automatic Face and Gesture Recognition, p. 366 (2002)
Zelnik-Manor, L., Irani, M.: Event-based analysis of video. In: Proceedings of CVPR 2001, vol. 2, pp. 123–128 (2001)
Efros, A., Berg, A., Mori, G., Malik, J.: Recognizing action at a distance. In: Proceedings of the 9th International Conference on Computer Vision, vol. 2, pp. 726–734 (October 2003)
Laptev, I., Caputo, B., Schuldt, C., Lindeberg, T.: Local velocity-adapted motion events for spatio-temporal recognition. Computer Vision and Image Understanding 108(3), 207–229 (2007)
Dollar, P., Rabaud, V., Cottrell, G., Belongie, S.: Behavior recognition via sparse spatio-temporal features. In: VS-PETS, pp. 65–72 (2005)
Michels, L., Lappe, M., Vaina, L.: Visual areas involved in the perception of human movement from dynamic analysis. Brain Imaging 16(10), 1037–1041 (2005)
Niebles, J.C., Wang, H., Fei-Fei, L.: Unsupervised learning of human action categories using spatial–temporal words. International Journal of Computer Vision 79(3), 299–318 (2008)
Wong, S.F., Kim, T.K., Cipolla, R.: Learning motion categories using both semantic and structural information. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition, pp. 1–6 (June 2007)
Giese, M., Poggio, T.: Neural mechanisms for the recognition of biological movements and actions. Nature Reviews Neuroscience 4, 179–192 (2003)
Jhuang, H., Serre, T., Wolf, L., Poggio, T.: A biologically inspired system for action recognition. In: Proceedings of the 11th International Conference on Computer Vision, pp. 1–8 (2007)
Serre, T., Wolf, L., Poggio, T.: Object recognition with features inspired by visual cortex. In: Proceedings of the International Conference on Computer Vision and Pattern Recognition, pp. 994–1000 (June 2005)
Xiao, D.K., Raiguel, S., Marcar, V., Orban, G.A.: The spatial distribution of the antagonistic surround of MT/V5 neurons. Cerebral Cortex 7(7), 662–677 (1997)
Xiao, D., Raiguel, S., Marcar, V., Koenderink, J., Orban, G.A.: Spatial heterogeneity of inhibitory surrounds in the middle temporal visual area. Proceedings of the National Academy of Sciences 92(24), 11303–11306 (1995)
Escobar, M., Masson, G., Kornprobst, P.: A simple mechanism to reproduce the neural solution of the aperture problem in monkey area MT. Research Report 6579, INRIA (2008)
Tsotsos, J., Liu, Y., Martinez-Trujillo, J., Pomplun, M., Simine, E., Zhou, K.: Attending to visual motion. Computer Vision and Image Understanding 100, 3–40 (2005)
Nowlan, S., Sejnowski, T.: A selection model for motion processing in area MT of primates. J. Neuroscience 15, 1195–1214 (1995)
Rust, N., Mante, V., Simoncelli, E., Movshon, J.: How MT cells analyze the motion of visual patterns. Nature Neuroscience 9(11), 1421–1431 (2006)
Simoncelli, E.P., Heeger, D.: A model of neuronal responses in visual area MT. Vision Research 38, 743–761 (1998)
Grzywacz, N., Yuille, A.: A model for the estimate of local image velocity by cells on the visual cortex. Proc. R. Soc. Lond. B. Biol. Sci. 239(1295), 129–161 (1990)
Berzhanskaya, J., Grossberg, S., Mingolla, E.: Laminar cortical dynamics of visual form and motion interactions during coherent object motion perception. Spatial Vision 20(4), 337–395 (2007)
Bayerl, P., Neumann, H.: Disambiguating visual motion by form–motion interaction – a computational model. International Journal of Computer Vision 72(1), 27–45 (2007)
Adelson, E., Bergen, J.: Spatiotemporal energy models for the perception of motion. Journal of the Optical Society of America A 2, 284–299 (1985)
Carandini, M., Demb, J.B., Mante, V., Tolhurst, D.J., Dan, Y., Olshausen, B.A., Gallant, J.L., Rust, N.C.: Do we know what the early visual system does? Journal of Neuroscience 25(46), 10577–10597 (2005)
Robson, J.: Spatial and temporal contrast-sensitivity functions of the visual system. J. Opt. Soc. Am. 56, 1141–1142 (1966)
Albrecht, D., Geisler, W., Crane, A.: Nonlinear properties of visual cortex neurons: Temporal dynamics, stimulus selectivity, neural performance, pp. 747–764. MIT Press, Cambridge (2003)
Destexhe, A., Rudolph, M., Paré, D.: The high-conductance state of neocortical neurons in vivo. Nature Reviews Neuroscience 4, 739–751 (2003)
Priebe, N., Cassanello, C., Lisberger, S.: The neural representation of speed in macaque area MT/V5. Journal of Neuroscience 23(13), 5650–5661 (2003)
Perrone, J., Thiele, A.: Speed skills: measuring the visual speed analyzing properties of primate MT neurons. Nature Neuroscience 4(5), 526–532 (2001)
Liu, J., Newsome, W.T.: Functional organization of speed tuned neurons in visual area MT. Journal of Neurophysiology 89, 246–256 (2003)
Perrone, J.: A visual motion sensor based on the properties of V1 and MT neurons. Vision Research 44, 1733–1755 (2004)
Huang, X., Albright, T.D., Stoner, G.R.: Adaptive surround modulation in cortical area MT. Neuron 53, 761–770 (2007)
Topsoe, F.: Some inequalities for information divergence and related measures of discrimination. IEEE Transactions on Information Theory 46(4), 1602–1609 (2000)
Zelnik-Manor, L., Irani, M.: Statistical analysis of dynamic actions. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(9), 1530–1535 (2006)
Blank, M., Gorelick, L., Shechtman, E., Irani, M., Basri, R.: Actions as space-time shapes. In: Proceedings of the 10th International Conference on Computer Vision, vol. 2, pp. 1395–1402 (2005)
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Escobar, MJ., Kornprobst, P. (2008). Action Recognition with a Bio–inspired Feedforward Motion Processing Model: The Richness of Center-Surround Interactions. In: Forsyth, D., Torr, P., Zisserman, A. (eds) Computer Vision – ECCV 2008. ECCV 2008. Lecture Notes in Computer Science, vol 5305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88693-8_14
DOI: https://doi.org/10.1007/978-3-540-88693-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88692-1
Online ISBN: 978-3-540-88693-8
eBook Packages: Computer Science, Computer Science (R0)