Multiple Proposals for Continuous Arabic Sign Language Recognition
The deaf community relies on sign language as its primary means of communication. For the millions of people around the world with hearing loss, interaction with hearing people is difficult. The main objective of sign language recognition (SLR) research is the development of automatic SLR systems that facilitate communication with the deaf community. Arabic SLR (ArSLR) in particular received little attention until recent years. This work presents a comprehensive comparison between two recognition techniques for continuous ArSLR: a modified k-Nearest Neighbor (KNN) classifier suited to sequential data, and Hidden Markov Models (HMMs) implemented with two different toolkits. Additionally, two new ArSL datasets comprising 40 Arabic sentences are collected using a Polhemus G4 motion tracker and a camera; an existing glove-based dataset is employed as well. All three datasets are made publicly available to the research community. The advantages and disadvantages of each data acquisition approach and classification technique are discussed. The experimental results show that the classification accuracy for sign sentences acquired with the motion tracker is very similar to that for sentences acquired with sensor gloves, and that the modified KNN solution is inferior to HMMs in terms of the computational time required for classification.
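The paper's modified KNN is not detailed in this abstract, but the trade-off it reports (comparable accuracy, higher classification-time cost than HMMs) is characteristic of instance-based sequence classifiers. The sketch below is a minimal illustration, not the authors' method: it uses 1-nearest-neighbor with a dynamic time warping (DTW) distance as a generic stand-in for KNN over variable-length sensor sequences; the data shapes and the names `dtw_distance` and `knn_predict` are hypothetical.

```python
# Minimal sketch: 1-NN over variable-length sensor sequences using DTW.
# This is a generic stand-in, NOT the paper's modified KNN; all names and
# data shapes here are illustrative assumptions.
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences (frames x features)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # frame-to-frame distance
            cost[i, j] = d + min(cost[i - 1, j],       # step: insertion
                                 cost[i, j - 1],       # step: deletion
                                 cost[i - 1, j - 1])   # step: match
    return cost[n, m]

def knn_predict(train_seqs, train_labels, query_seq):
    """Label the query with its nearest training sequence under DTW."""
    dists = [dtw_distance(query_seq, s) for s in train_seqs]
    return train_labels[int(np.argmin(dists))]

# Toy usage: two synthetic "sign" sequences and a query drawn near the second.
rng = np.random.default_rng(0)
train = [rng.normal(0.0, 1.0, (20, 3)), rng.normal(5.0, 1.0, (25, 3))]
labels = ["sentence_A", "sentence_B"]
print(knn_predict(train, labels, rng.normal(5.0, 1.0, (22, 3))))  # sentence_B
```

Every prediction scans the full training set and runs an O(nm) DTW alignment per pair, which is why instance-based classifiers of this kind tend to be slower at classification time than HMMs, whose per-sentence decoding cost depends on the model size rather than on the number of stored training examples.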
Keywords: Arabic sign language recognition · Pattern classification · Feature extraction · Motion detectors
The authors gratefully acknowledge the American University of Sharjah for supporting this research through Grant FRG14-2-26.