Detecting Hand-Head Occlusions in Sign Language Video

  • Ville Viitaniemi
  • Matti Karppa
  • Jorma Laaksonen
  • Tommi Jantunen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7944)


A large body of current linguistic research on sign language is based on analyzing large corpora of video recordings, which requires either manual or automatic annotation of the videos. In this paper we introduce methods for automatically detecting and classifying hand-head occlusions in sign language videos. Linguistically, hand-head occlusions are an important and interesting subject of study because the head is a structural place of articulation in many signs. Our method combines easily calculable local video properties with more global hand tracking. Experiments carried out with videos from the Suvi on-line dictionary of Finnish Sign Language show that the sensitivity of the proposed local method in detecting occlusion events is 92.6%. When global hand tracking is added to the method, the specificity can reach 93.7% while the detection sensitivity remains above 90%.
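For readers less familiar with the two figures quoted above, the following minimal sketch shows how sensitivity and specificity are conventionally computed from detection counts. It is an illustration only: the function names and the counts are hypothetical, and the abstract does not state how the paper itself counts events, so this should not be read as the authors' evaluation procedure.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of actual occlusion events that the detector found."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive events to evaluate")
    return true_pos / total


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of non-occlusion cases that the detector correctly rejected."""
    total = true_neg + false_pos
    if total == 0:
        raise ValueError("no negative cases to evaluate")
    return true_neg / total


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
tp, fn = 9, 1    # occlusion events detected / missed
tn, fp = 18, 2   # non-occlusions correctly rejected / falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.1%}")
print(f"specificity = {specificity(tn, fp):.1%}")
```

A detector can trade one figure against the other: flagging more candidates raises sensitivity but tends to lower specificity, which is why the abstract reports both together.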


Keywords (machine-generated, not supplied by the authors): Extreme Learning Machine; Sign Language; Sign Language Video; Head Region; Facial Landmark



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Ville Viitaniemi (1)
  • Matti Karppa (1)
  • Jorma Laaksonen (1)
  • Tommi Jantunen (2)
  1. Department of Information and Computer Science, Aalto University School of Science, Espoo, Finland
  2. Sign Language Centre, Department of Languages, University of Jyväskylä, Finland
