Recognizing Skeleton-Based Hand Gestures by a Spatio-Temporal Network

  • Conference paper
Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track (ECML PKDD 2021)

Abstract

A key challenge in skeleton-based hand gesture recognition is that a gesture can often be performed in several different ways, each with its own configuration of poses and its own spatio-temporal dependencies among them. This leads us to define a spatio-temporal network model that explicitly characterizes these internal configurations of poses and their local spatio-temporal dependencies. The model introduces a latent vector variable, derived from the coordinate embedding, to characterize the unique fine-grained configurations among the joints of a particular hand gesture. Furthermore, an attention scorer is devised to exchange joint-pose information in the encoder structure, so that all local spatio-temporal dependencies are globally consistent. Empirical evaluations on two benchmark datasets and one in-house dataset suggest that our approach significantly outperforms state-of-the-art methods.
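
As a rough illustration of what exchanging information among joints through attention can look like, the sketch below applies generic scaled dot-product self-attention to the joints of a single frame. It is not the paper's attention scorer or encoder, whose details the abstract does not give: the function names `softmax` and `joint_attention`, the use of raw 3D coordinates as embeddings, and the three-joint frame are all assumptions made for the example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(embeddings):
    """Generic scaled dot-product self-attention over the joints of one frame.

    `embeddings` is a list of equal-length vectors, one per joint.
    Returns (weights, mixed): weights[i][j] is how much joint i attends to
    joint j, and mixed[i] is the attention-weighted average of all joint
    embeddings as seen from joint i.
    """
    dim = len(embeddings[0])
    scale = math.sqrt(dim)
    weights = []
    for query in embeddings:
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in embeddings
        ]
        weights.append(softmax(scores))
    mixed = [
        [sum(w * vec[i] for w, vec in zip(row, embeddings)) for i in range(dim)]
        for row in weights
    ]
    return weights, mixed


# Hypothetical frame: three joints, raw 3D coordinates used as embeddings.
frame = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
weights, mixed = joint_attention(frame)
```

Each row of `weights` sums to one, so every joint's updated representation is a convex combination of all joint embeddings in the frame. A learned model would typically project the embeddings into separate query, key, and value spaces first; that step is omitted here to keep the sketch short.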

Acknowledgement

This work was supported by grants from the National Major Science and Technology Projects of China (grant no. 2018AAA0100703) and the National Natural Science Foundation of China (grant no. 61977012).

Author information

Corresponding author

Correspondence to Li Liu.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Li, X., Liao, J., Liu, L. (2021). Recognizing Skeleton-Based Hand Gestures by a Spatio-Temporal Network. In: Dong, Y., Kourtellis, N., Hammer, B., Lozano, J.A. (eds) Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track. ECML PKDD 2021. Lecture Notes in Computer Science, vol 12978. Springer, Cham. https://doi.org/10.1007/978-3-030-86514-6_10

  • DOI: https://doi.org/10.1007/978-3-030-86514-6_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-86513-9

  • Online ISBN: 978-3-030-86514-6

  • eBook Packages: Computer Science, Computer Science (R0)
