Vision-Based Assistive Systems for Deaf and Hearing Impaired People

Computer Vision in Advanced Control Systems-5

Part of the book series: Intelligent Systems Reference Library ((ISRL,volume 175))

Abstract

Sign language is the primary means of communication in the deaf and hearing-impaired community and consists of a combination of hand movements and facial expressions. Successful computer vision research in recent years has paved the way for the first automatic sign language recognition systems. However, unresolved challenges, such as cultural differences among the sign languages of the world, the lack of representative databases for model training, the relatively small size of the region of interest, and occlusion, keep the reliability of automatic sign language recognition far from human-level performance, especially for Russian Sign Language (RSL). To address this issue, we present a framework and an automatic system for recognizing one-handed RSL gestures. The developed system supports both online and offline modes and recognizes 44 classes of one-handed RSL gestures with almost 70% accuracy. The system is based on the color-depth Kinect v2 sensor and is trained on the TheRuSLan database using a combination of state-of-the-art deep learning approaches. Future research will focus on extracting additional features, expanding the data set, and increasing the number of recognizable gestures by adding two-handed gestures. The developed vision-based RSL recognition system is meant as an auxiliary system for deaf and hearing-impaired people.

Acknowledgements

This research was financially supported by the Ministry of Science and Higher Education of Russia, agreement No. 075-15-2019-1295 (reference RFMEFI61618X0095).

Corresponding author

Correspondence to Dmitry Ryumin.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ryumin, D., Ivanko, D., Kagirov, I., Axyonov, A., Karpov, A. (2020). Vision-Based Assistive Systems for Deaf and Hearing Impaired People. In: Favorskaya, M., Jain, L. (eds) Computer Vision in Advanced Control Systems-5. Intelligent Systems Reference Library, vol 175. Springer, Cham. https://doi.org/10.1007/978-3-030-33795-7_7
