Vision-Based Assistive Systems for Deaf and Hearing Impaired People

Ryumin, Dmitry; Ivanko, Denis; Kagirov, Ildar; Axyonov, Alexander; Karpov, Alexey

doi:10.1007/978-3-030-33795-7_7

Part of the book series: Intelligent Systems Reference Library (ISRL, volume 175)

Abstract

Sign language is a means of communication within the deaf and hearing-impaired community and consists of a combination of hand movements and facial expressions. Successful computer vision research in recent years has paved the way for the first automatic sign language recognition systems. However, unresolved challenges, such as cultural differences among the world's sign languages, the lack of representative databases for model training, the relatively small size of the region of interest, and occlusion, keep the reliability of automatic sign language recognition far from human-level performance, especially for Russian Sign Language (RSL). To address this issue, we present a framework and an automatic system for recognizing one-handed RSL gestures. The developed system supports both online and offline modes and recognizes 44 classes of one-handed RSL gestures with almost 70% accuracy. The system is based on the color-depth Kinect v2 sensor and is trained on the TheRuSLan database using a combination of state-of-the-art deep learning approaches. Future research will focus on extracting additional features, expanding the data set, and extending the set of recognizable gestures to include two-handed gestures. The developed vision-based RSL recognition system is meant as an auxiliary system for deaf and hearing-impaired people.

Acknowledgements

This research was financially supported by the Ministry of Science and Higher Education of Russia, agreement No. 075-15-2019-1295 (reference RFMEFI61618X0095).

Author information

Authors and Affiliations

St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, SPIIRAS, 39, 14th Line, 199178, St. Petersburg, Russian Federation
Dmitry Ryumin, Denis Ivanko, Ildar Kagirov, Alexander Axyonov & Alexey Karpov

Corresponding author

Correspondence to Dmitry Ryumin.

Editor information

Editors and Affiliations

Department of Informatics and Computer Techniques, Institute of Informatics and Telecommunications, Reshetnev Siberian State University of Science and Technology, Krasnoyarsk, Russia
Margarita N. Favorskaya
University of Canberra, Canberra, SA, Australia
Lakhmi C. Jain

About this chapter

Cite this chapter

Ryumin, D., Ivanko, D., Kagirov, I., Axyonov, A., Karpov, A. (2020). Vision-Based Assistive Systems for Deaf and Hearing Impaired People. In: Favorskaya, M., Jain, L. (eds) Computer Vision in Advanced Control Systems-5. Intelligent Systems Reference Library, vol 175. Springer, Cham. https://doi.org/10.1007/978-3-030-33795-7_7

DOI: https://doi.org/10.1007/978-3-030-33795-7_7
Published: 08 December 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33794-0
Online ISBN: 978-3-030-33795-7
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
