Hand Tracking and Affine Shape-Appearance Handshape Sub-units in Continuous Sign Language Recognition

  • Anastasios Roussos
  • Stavros Theodorakis
  • Vassilis Pitsikalis
  • Petros Maragos
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6553)


We propose and investigate a framework that applies probabilistic and morphological visual processing to the segmentation, tracking and handshape modeling of the hands, and that serves as the front-end for sign language video analysis. Our ultimate goal is to explore the automatic construction of Handshape Sub-Units (HSUs) and the use of the overall system in automatic sign language recognition (ASLR). We employ probabilistic skin color detection followed by the proposed morphological algorithms and related shape filtering for fast and reliable segmentation of the hands and head. The segmentation is then fed to our hand tracking system, which emphasizes robust handling of occlusions based on forward-backward prediction and the incorporation of probabilistic constraints. The tracking output is exploited by an Affine-invariant Modeling of hand Shape-Appearance images, which offers a compact and descriptive representation of the hand configurations. We further propose using the handshape features extracted by fitting this model to construct basic HSUs in an unsupervised way. We first provide intuitive results on the HSU-to-sign mapping and then quantitatively evaluate the integrated system and the constructed HSUs in ASLR experiments at the sub-unit and sign level. These experiments are conducted on continuous SL data from the BU400 corpus and investigate the effect of the parameters involved. The results indicate the effectiveness of the overall approach, especially of the handshape modeling when incorporated in the HSU-based framework.
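The first stage described above, probabilistic skin color detection followed by morphological filtering, can be illustrated with a minimal sketch. It is not the authors' algorithm: the single-Gaussian color model, the 0.5 threshold, the 3x3 square structuring element and all function names (`skin_probability`, `open_mask`, `segment_skin`) are assumptions chosen only for illustration.

```python
import numpy as np

def skin_probability(image, mean, cov):
    """Unnormalised Gaussian skin likelihood per pixel, in (0, 1].

    `image` has shape (H, W, C); `mean` and `cov` describe an assumed
    single-Gaussian skin colour model (hypothetical, not from the paper).
    """
    diff = image.reshape(-1, image.shape[-1]) - mean
    inv_cov = np.linalg.inv(cov)
    # Squared Mahalanobis distance of every pixel to the skin colour mean.
    m2 = np.einsum("ni,ij,nj->n", diff, inv_cov, diff)
    return np.exp(-0.5 * m2).reshape(image.shape[:2])

def erode(mask, k=3):
    """Binary erosion with a k x k square structuring element."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    out = np.ones_like(mask, dtype=bool)
    for dy in range(k):
        for dx in range(k):
            out &= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def dilate(mask, k=3):
    """Binary dilation with a k x k square structuring element."""
    pad = k // 2
    padded = np.pad(mask, pad, mode="constant", constant_values=False)
    out = np.zeros_like(mask, dtype=bool)
    for dy in range(k):
        for dx in range(k):
            out |= padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out

def open_mask(mask, k=3):
    """Morphological opening: erosion followed by dilation."""
    return dilate(erode(mask, k), k)

def segment_skin(image, mean, cov, threshold=0.5, k=3):
    """Threshold the skin likelihood, then remove small blobs by opening."""
    return open_mask(skin_probability(image, mean, cov) > threshold, k)
```

In this sketch the opening removes skin-colored regions smaller than the structuring element while leaving larger regions, such as a hand or the head, largely intact. The shape filtering, tracking and occlusion handling mentioned in the abstract are not covered.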


Sign Language · American Sign · Active Appearance Model · Vocabulary Size · Hand Gesture Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  • Anastasios Roussos (1)
  • Stavros Theodorakis (1)
  • Vassilis Pitsikalis (1)
  • Petros Maragos (1)
  1. School of E.C.E., National Technical University of Athens, Greece
