Towards End-to-End Gesture Recognition with Recurrent Neural Networks

  • Conference paper
Proceedings of 2018 Chinese Intelligent Systems Conference

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 529)

Abstract

As smart devices spread, gesture recognition is being applied in a growing number of fields, yet the gesture recognition devices currently on the market are inconvenient and expensive. Human motion analysis and recognition based on attitude sensors is an emerging field. Algorithms based on recurrent neural networks take the temporal information of an action into account and can better resolve the timing uncertainty of human motion, but their efficiency declines as the number of training samples grows. This paper proposes an action recognition method based on connectionist temporal classification (CTC) for sequence learning, which realizes end-to-end recognition of gestures.
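
To make the end-to-end idea concrete, the following is a minimal sketch of CTC best-path (greedy) decoding, the step that turns per-frame network outputs into a label sequence without frame-level segmentation. It is an illustration only, not the method of this paper: the function name `ctc_greedy_decode`, the choice of index 0 for the blank label, and the toy scores are all assumptions made for this example.

```python
from typing import List, Sequence

BLANK = 0  # assumed index of the CTC blank label (a convention, not from the paper)


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]],
                      blank: int = BLANK) -> List[int]:
    """Best-path CTC decoding.

    Takes one score vector per time step, picks the highest-scoring label
    in each frame, merges consecutive repeats, and drops blank labels.
    """
    decoded: List[int] = []
    previous = blank
    for scores in frame_scores:
        # Index of the highest-scoring label in this frame.
        label = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the label changes and is not the blank.
        if label != previous and label != blank:
            decoded.append(label)
        previous = label
    return decoded


# Hypothetical scores for six sensor frames over {blank, gesture 1, gesture 2}.
frames = [
    [0.10, 0.80, 0.10],  # gesture 1
    [0.10, 0.70, 0.20],  # gesture 1 (repeat, merged)
    [0.90, 0.05, 0.05],  # blank (separates two instances of gesture 1)
    [0.20, 0.70, 0.10],  # gesture 1 again
    [0.10, 0.20, 0.70],  # gesture 2
    [0.10, 0.10, 0.80],  # gesture 2 (repeat, merged)
]
labels = ctc_greedy_decode(frames)
print(labels)
```

The blank frame is what lets the same gesture appear twice in a row in the output; without it, the two runs of gesture 1 would be merged into one.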

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61433003, 61573174, and 61273150).

Author information

Corresponding author

Correspondence to Xuemei Ren.

Copyright information

© 2019 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Du, T., Ren, X. (2019). Towards End-to-End Gesture Recognition with Recurrent Neural Networks. In: Jia, Y., Du, J., Zhang, W. (eds) Proceedings of 2018 Chinese Intelligent Systems Conference. Lecture Notes in Electrical Engineering, vol 529. Springer, Singapore. https://doi.org/10.1007/978-981-13-2291-4_15
