Abstract
This work targets the identification of people in video based on the way they walk (i.e. gait). While classical methods typically derive gait signatures from sequences of binary silhouettes, in this work we explore the use of convolutional neural networks (CNNs) for learning high-level descriptors from low-level motion features (i.e. optical flow components). We carry out a thorough experimental evaluation of the proposed CNN architecture on the challenging TUM-GAID dataset. The experimental results indicate that using spatio-temporal cuboids of optical flow as input to the CNN yields state-of-the-art results on the gait task, for both identification and gender recognition, at an image resolution eight times lower than that of previously reported results (i.e. \(80\times 60\) pixels).
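As an illustration of what a spatio-temporal cuboid of optical flow might look like as a CNN input, the following is a minimal sketch, not the authors' implementation. It assumes the flow has already been computed and stored as an array of shape (T, H, W, 2), and that \(80\times 60\) means width by height. The function name `build_flow_cuboids`, the window length of 25 frames, the stride of 5 frames, and the channel ordering are all hypothetical choices made for the example; the abstract does not specify them. Random data stands in for real flow.

```python
import numpy as np

def build_flow_cuboids(flow, length=25, stride=5):
    """Cut a flow sequence into overlapping spatio-temporal cuboids.

    flow: array of shape (T, H, W, 2) holding per-frame (dx, dy) optical flow.
    Returns an array of shape (N, 2 * length, H, W), one cuboid per window,
    with the x and y components of each frame stored as separate channels.
    `length` and `stride` are illustrative defaults, not values from the paper.
    """
    flow = np.asarray(flow, dtype=np.float32)
    num_frames, height, width, _ = flow.shape
    cuboids = []
    for start in range(0, num_frames - length + 1, stride):
        window = flow[start:start + length]        # (length, H, W, 2)
        channels = window.transpose(0, 3, 1, 2)    # (length, 2, H, W)
        # Interleave components: channel 2*t is dx of frame t, 2*t+1 is dy.
        cuboids.append(channels.reshape(2 * length, height, width))
    if not cuboids:
        # Sequence shorter than one window: return an empty batch.
        return np.empty((0, 2 * length, height, width), dtype=np.float32)
    return np.stack(cuboids)

# Synthetic stand-in for a real flow sequence: 60 frames, 60 rows x 80 columns.
rng = np.random.default_rng(0)
demo_flow = rng.standard_normal((60, 60, 80, 2)).astype(np.float32)
demo_cuboids = build_flow_cuboids(demo_flow, length=25, stride=5)
print(demo_cuboids.shape)
```

Each cuboid can then be fed to a CNN whose first layer expects `2 * length` input channels. Overlapping windows are one plausible way to obtain several training samples from a single walking sequence.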
Notes
1. Note that TUM-GAID distinguishes between training/test subjects and training/test sequences. Test sequences are never used for training or validation of the model.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Castro, F.M., Marín-Jiménez, M.J., Guil, N., Pérez de la Blanca, N. (2017). Automatic Learning of Gait Signatures for People Identification. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2017. Lecture Notes in Computer Science, vol 10306. Springer, Cham. https://doi.org/10.1007/978-3-319-59147-6_23
DOI: https://doi.org/10.1007/978-3-319-59147-6_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59146-9
Online ISBN: 978-3-319-59147-6
eBook Packages: Computer Science, Computer Science (R0)