Pronunciation Clustering and Modeling of Variability for Appearance-Based Sign Language Recognition

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3881)

Abstract

In this paper, we present a system for the automatic recognition of segmented words in American Sign Language (ASL). The system uses appearance-based features extracted directly from frames captured by standard cameras, without any special data acquisition tools. It therefore does not rely on complex preprocessing of the video signal or on an intermediate segmentation step that may produce errors. We introduce a database for ASL word recognition extracted from a publicly available set of video streams. One important property of this database is the large variability of the utterances for each word. To cope with this variability, we propose to model distinct pronunciations of each word using different clustering approaches. Automatic clustering of pronunciations reduces the error rate of the system from 28.4% to 23.2%. To model global image transformations, the tangent distance replaces the Euclidean distance within the Gaussian emission densities of the hidden Markov model classifier. This further reduces the error rate to 21.5%.
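To make the tangent distance mentioned in the abstract more concrete, the following is a minimal sketch of a one-sided squared tangent distance in NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names `one_sided_tangent_distance` and `shift_tangents` are hypothetical, only horizontal and vertical shifts are modelled, the tangent vectors are assumed to be linearly independent, and the tangents are approximated by finite differences of the reference image.

```python
import numpy as np


def shift_tangents(image):
    """Hypothetical helper: approximate tangent vectors for small
    horizontal and vertical shifts using finite-difference gradients."""
    # np.gradient returns derivatives along axis 0 (rows), then axis 1 (columns).
    dy, dx = np.gradient(image.astype(float))
    return np.stack([dx.ravel(), dy.ravel()])  # shape (2, D)


def one_sided_tangent_distance(x, mu, tangents):
    """Squared distance from x to the affine subspace mu + span(tangents).

    x, mu    : flattened images of shape (D,)
    tangents : array of shape (L, D), assumed linearly independent
    """
    diff = x - mu
    # Orthonormal basis (columns of q) of the tangent subspace at mu.
    q, _ = np.linalg.qr(tangents.T)
    proj = q.T @ diff
    # Remove the part of the difference that the tangents can explain.
    # Clamp at zero to guard against floating-point round-off.
    return max(0.0, float(diff @ diff - proj @ proj))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = rng.random((8, 8))
    T = shift_tangents(reference)
    mu = reference.ravel()
    # An observation that differs from mu only along the tangent directions.
    x = mu + 0.3 * T[0] - 0.2 * T[1]
    print("squared Euclidean distance:", float((x - mu) @ (x - mu)))
    print("squared tangent distance:  ", one_sided_tangent_distance(x, mu, T))
```

In this sketch the reference `mu` plays the role of a Gaussian mean, and the tangent distance would stand in for the squared Euclidean distance in the exponent of the density. Differences that lie along the modelled transformations are not penalised, which is the intuition behind the invariance to small global image transformations.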



Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zahedi, M., Keysers, D., Ney, H. (2006). Pronunciation Clustering and Modeling of Variability for Appearance-Based Sign Language Recognition. In: Gibet, S., Courty, N., Kamp, JF. (eds) Gesture in Human-Computer Interaction and Simulation. GW 2005. Lecture Notes in Computer Science, vol 3881. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11678816_8

  • DOI: https://doi.org/10.1007/11678816_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32624-3

  • Online ISBN: 978-3-540-32625-0

  • eBook Packages: Computer Science, Computer Science (R0)
