Lipreading using Fourier transform over time

Yu, Keren; Jiang, Xiaoyi; Bunke, Horst

doi:10.1007/3-540-63460-6_152

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1296)

Included in the following conference series:

International Conference on Computer Analysis of Images and Patterns

Abstract

This paper describes a novel approach to visual speech recognition. The intensity of each pixel in an image sequence is treated as a function of time, and a one-dimensional Fourier transform is applied to this intensity-versus-time function to model the lip movements. We present the results of experiments performed on two databases of ten English digits and letters, respectively.
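
The core idea of the abstract, a one-dimensional Fourier transform applied along the time axis of every pixel, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `temporal_dft` and `pixelwise_spectrum`, the `num_coeffs` parameter, the choice to keep only coefficient magnitudes, and the toy input are all assumptions made for illustration, since the abstract does not specify which coefficients are used or how they are classified.

```python
import cmath
import math


def temporal_dft(signal, num_coeffs):
    """Return the first num_coeffs DFT coefficients of a 1-D sequence."""
    n = len(signal)
    coeffs = []
    for k in range(num_coeffs):
        acc = 0j
        for t, value in enumerate(signal):
            acc += value * cmath.exp(-2j * math.pi * k * t / n)
        coeffs.append(acc)
    return coeffs


def pixelwise_spectrum(frames, num_coeffs):
    """Compute per-pixel DFT magnitudes over time.

    frames[t][y][x] is the intensity of pixel (x, y) at time step t;
    all frames are assumed to have the same size.
    result[y][x] is a list of num_coeffs magnitudes for that pixel.
    Keeping magnitudes only is an assumption of this sketch.
    """
    height = len(frames[0])
    width = len(frames[0][0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Intensity-versus-time function of a single pixel.
            signal = [frame[y][x] for frame in frames]
            row.append([abs(c) for c in temporal_dft(signal, num_coeffs)])
        result.append(row)
    return result


# Hypothetical toy sequence: 4 frames of 1x2 pixels. The left pixel is
# constant over time; the right pixel alternates between dark and bright.
frames = [
    [[5.0, 0.0]],
    [[5.0, 10.0]],
    [[5.0, 0.0]],
    [[5.0, 10.0]],
]
spectrum = pixelwise_spectrum(frames, num_coeffs=3)
```

In this toy input the constant pixel has energy only in the zero-frequency coefficient, while the alternating pixel also has energy at the highest representable frequency, which is the kind of temporal signature a per-pixel transform exposes. A practical system would more likely use an FFT routine than this direct O(T²) sum.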

Author information

Authors and Affiliations

Department of Computer Science, University of Bern, Switzerland
Keren Yu, Xiaoyi Jiang & Horst Bunke

Editor information

Gerald Sommer, Kostas Daniilidis, Josef Pauli

Cite this paper

Yu, K., Jiang, X., Bunke, H. (1997). Lipreading using Fourier transform over time. In: Sommer, G., Daniilidis, K., Pauli, J. (eds) Computer Analysis of Images and Patterns. CAIP 1997. Lecture Notes in Computer Science, vol 1296. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63460-6_152

DOI: https://doi.org/10.1007/3-540-63460-6_152
Published: 02 August 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63460-7
Online ISBN: 978-3-540-69556-1
eBook Packages: Springer Book Archive
