Recognition of Spatiotemporal Gestures in Sign Language Using Gesture Threshold HMMs

  • Daniel Kelly
  • John McDonald
  • Charles Markham
Part of the Advances in Pattern Recognition book series (ACVPR)


In this paper, we propose a framework for the automatic recognition of spatiotemporal gestures in sign language. We extend the standard HMM to develop a gesture threshold HMM (GT-HMM) framework specifically designed to identify inter-gesture transitions. We evaluate the performance of this system against conditional random field (CRF), hidden CRF (HCRF) and latent-dynamic CRF (LDCRF) based systems when recognizing motion gestures and identifying inter-gesture transitions.
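The GT-HMM builds on the classic HMM threshold-model idea for gesture spotting: an input sequence is accepted as a gesture only when its likelihood under the best-scoring gesture HMM exceeds its likelihood under a dedicated threshold (non-gesture) model; otherwise it is treated as an inter-gesture transition. The Python sketch below illustrates that decision rule with a pure-Python forward algorithm over discrete observations; the `spot_gesture` helper and all model parameters are illustrative assumptions, not the chapter's actual models.

```python
import math

def forward_log_likelihood(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the forward algorithm (toy sizes, no scaling)."""
    n = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [math.log(pi[i]) + math.log(B[i][obs[0]]) for i in range(n)]
    for t in range(1, len(obs)):
        # Recursion: alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(o_t)
        alpha = [
            math.log(sum(math.exp(alpha[i]) * A[i][j] for i in range(n)))
            + math.log(B[j][obs[t]])
            for j in range(n)
        ]
    return math.log(sum(math.exp(a) for a in alpha))

def spot_gesture(obs, gesture_models, threshold_model):
    """Return the best gesture label if it outscores the threshold model,
    or None to flag an inter-gesture transition."""
    best_name, best_ll = max(
        ((name, forward_log_likelihood(obs, *m))
         for name, m in gesture_models.items()),
        key=lambda item: item[1],
    )
    if best_ll > forward_log_likelihood(obs, *threshold_model):
        return best_name
    return None

# Hypothetical 2-state, 2-symbol models: a "wave" gesture HMM that
# strongly favors symbol 0, and a near-uniform threshold model.
gesture_models = {
    "wave": ([0.9, 0.1],
             [[0.8, 0.2], [0.2, 0.8]],
             [[0.9, 0.1], [0.9, 0.1]]),
}
threshold_model = ([0.5, 0.5],
                   [[0.5, 0.5], [0.5, 0.5]],
                   [[0.5, 0.5], [0.5, 0.5]])
```

A sequence dominated by symbol 0 (e.g. `[0, 0, 0, 0]`) is spotted as `"wave"`, while an alternating sequence such as `[0, 1, 0, 1]` scores below the threshold model and is rejected as a transition. The chapter's GT-HMM differs in how the threshold model is constructed and trained, but the accept/reject comparison follows this pattern.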


Hidden Markov Model · Sign Language · Gesture Recognition · Hand Gesture · Conditional Random Field
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag London Limited 2011
