Abstract
Sign languages comprise parallel aspects and use several modalities to form a sign, but it is not yet clear how best to combine these modalities in statistical sign language recognition. We investigate early combination of features, late fusion of decisions, synchronous combination at the hidden Markov model state level, and asynchronous combination at the gloss level. This is done for five modalities on two publicly available benchmark databases: one consisting of challenging real-life data and one of the less complex lab data on which the state of the art typically focuses. Using modality combination, the best published word error rate is improved from 11.9% to 10.7% on the SIGNUM database (lab data) and from 55% to 41.9% on the RWTH-PHOENIX database (challenging real-life data).
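As a rough illustration of two of the four combination schemes named above, the following sketch contrasts early combination (concatenating per-frame feature vectors before recognition) with late fusion (combining per-modality hypothesis scores after separate decoding). This is not the paper's implementation; the modality names, dimensions, and weights are illustrative assumptions on toy data.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 10                            # number of video frames (toy value)
hand = rng.normal(size=(T, 4))    # hypothetical hand-shape features
face = rng.normal(size=(T, 3))    # hypothetical facial features

# Early combination: concatenate the per-frame feature vectors of all
# modalities into a single feature stream before HMM training/decoding.
early = np.concatenate([hand, face], axis=1)   # shape (T, 7)

# Late fusion: decode each modality separately, then combine the
# resulting hypothesis scores (here: a weighted sum of log-likelihoods).
def late_fusion(log_liks, weights):
    """Weighted sum of per-modality log-likelihood vectors, one entry
    per competing hypothesis."""
    return sum(w * ll for w, ll in zip(weights, log_liks))

scores_hand = np.array([-10.0, -12.5])   # toy scores for 2 hypotheses
scores_face = np.array([-11.0, -9.0])
combined = late_fusion([scores_hand, scores_face], weights=[0.6, 0.4])
best = int(np.argmax(combined))          # index of the winning hypothesis
```

The synchronous (state-level) and asynchronous (gloss-level) schemes additionally constrain where the modality streams must re-align during decoding, which a toy example like this does not capture.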
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Forster, J., Oberdörfer, C., Koller, O., Ney, H. (2013). Modality Combination Techniques for Continuous Sign Language Recognition. In: Sanches, J.M., Micó, L., Cardoso, J.S. (eds) Pattern Recognition and Image Analysis. IbPRIA 2013. Lecture Notes in Computer Science, vol 7887. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38628-2_10
Print ISBN: 978-3-642-38627-5
Online ISBN: 978-3-642-38628-2