Pronunciation Clustering and Modeling of Variability for Appearance-Based Sign Language Recognition

Zahedi, Morteza; Keysers, Daniel; Ney, Hermann

doi:10.1007/11678816_8

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3881)

Abstract

In this paper, we present a system for automatic sign language recognition of segmented words in American Sign Language (ASL). The system uses appearance-based features extracted directly from frames captured by standard cameras, without any special data acquisition tools. We therefore do not rely on complex preprocessing of the video signal or on an intermediate segmentation step that may introduce errors. We introduce a database for ASL word recognition extracted from a publicly available set of video streams. An important property of this database is the large variability of the utterances of each word. To cope with this variability, we propose to model distinct pronunciations of each word using different clustering approaches. Automatic clustering of pronunciations reduces the error rate of the system from 28.4% to 23.2%. To model global image transformations, the tangent distance is used instead of the Euclidean distance within the Gaussian emission densities of the hidden Markov model classifier. This approach further reduces the error rate to 21.5%.
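The idea of replacing the Euclidean distance with a tangent distance inside a Gaussian emission density can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a one-sided tangent distance, tangent vectors approximated by finite differences for horizontal and vertical shifts only, and a single isotropic variance. All function names (`shift_tangents`, `squared_tangent_distance`, `log_emission_score`, `make_blob`) and the toy 9x9 image are hypothetical.

```python
import math

def shift_tangents(img):
    """Finite-difference tangent vectors for horizontal and vertical shifts.

    img is a list of rows; each tangent is returned flattened row by row.
    """
    h, w = len(img), len(img[0])

    def px(r, c):
        # Clamp indices so the differences stay defined at the border.
        return img[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]

    dx = [(px(r, c + 1) - px(r, c - 1)) / 2.0 for r in range(h) for c in range(w)]
    dy = [(px(r + 1, c) - px(r - 1, c)) / 2.0 for r in range(h) for c in range(w)]
    return [dx, dy]

def orthonormalize(vectors, eps=1e-12):
    """Gram-Schmidt orthonormalization; (near-)zero directions are dropped."""
    basis = []
    for v in vectors:
        v = list(v)
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > eps:
            basis.append([x / norm for x in v])
    return basis

def squared_euclidean(x, mu):
    return sum((a - b) ** 2 for a, b in zip(x, mu))

def squared_tangent_distance(x, mu, tangents):
    """One-sided tangent distance: min over alpha of ||x - (mu + sum_l alpha_l t_l)||^2."""
    basis = orthonormalize(tangents)
    diff = [a - b for a, b in zip(x, mu)]
    d2 = sum(d * d for d in diff)
    for b in basis:
        proj = sum(d * y for d, y in zip(diff, b))
        d2 -= proj * proj  # remove the part explained by the tangent subspace
    return max(d2, 0.0)

def log_emission_score(x, mu, tangents, sigma2=1.0):
    """Gaussian-style log score with the tangent distance in the exponent.

    The normalization term is omitted (assumption: isotropic variance sigma2).
    """
    return -0.5 * squared_tangent_distance(x, mu, tangents) / sigma2

def make_blob(cx, cy, size=9, s=1.5):
    """Toy image: a smooth bump centered at (cx, cy)."""
    return [[math.exp(-((c - cx) ** 2 + (r - cy) ** 2) / (2 * s * s))
             for c in range(size)] for r in range(size)]

def flatten(img):
    return [v for row in img for v in row]

mu_img = make_blob(4.0, 4.0)   # stands in for a state mean
x_img = make_blob(4.4, 4.0)    # same pattern, shifted slightly to the right
mu, x = flatten(mu_img), flatten(x_img)
tangents = shift_tangents(mu_img)

ed = squared_euclidean(x, mu)
td = squared_tangent_distance(x, mu, tangents)
print("squared Euclidean distance:", ed)
print("squared tangent distance:  ", td)
```

In this toy setting the small shift is largely absorbed by the tangent subspace, so the tangent distance is smaller than the Euclidean distance. A state mean then scores a slightly translated observation more favorably than it would under a plain Euclidean exponent.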

Author information

Authors and Affiliations

Lehrstuhl für Informatik VI, Computer Science Department, RWTH Aachen University, D-52056, Aachen, Germany
Morteza Zahedi, Daniel Keysers & Hermann Ney

Editor information

Editors and Affiliations

Laboratoire Valoria, Centre de Recherche Yves Coppens, Université de Bretagne-Sud, Campus de Tohannic, 56000, Vannes, France
Sylvie Gibet
Laboratoire Valoria, Université de Bretagne Sud, Campus de Tohannic, 56000, Vannes, France
Nicolas Courty
Laboratoire Valoria, Centre de Recherche Yves Coppens, Université de Bretagne-Sud, Campus Tohannic, France
Jean-François Kamp

About this paper

Cite this paper

Zahedi, M., Keysers, D., Ney, H. (2006). Pronunciation Clustering and Modeling of Variability for Appearance-Based Sign Language Recognition. In: Gibet, S., Courty, N., Kamp, JF. (eds) Gesture in Human-Computer Interaction and Simulation. GW 2005. Lecture Notes in Computer Science, vol 3881. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11678816_8

DOI: https://doi.org/10.1007/11678816_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32624-3
Online ISBN: 978-3-540-32625-0
eBook Packages: Computer Science, Computer Science (R0)
