Lipreading using Fourier transform over time

  • Face Analysis
  • Conference paper
Computer Analysis of Images and Patterns (CAIP 1997)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 1296)

Abstract

This paper describes a novel approach to visual speech recognition. The intensity of each pixel in an image sequence is treated as a function of time, and a one-dimensional Fourier transform is applied to this intensity-versus-time function to model the lip movements. We present the results of experiments performed on two databases of ten English digits and letters, respectively.
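
The core idea of the abstract, a one-dimensional Fourier transform of each pixel's intensity over time, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `pixel_dft` and `temporal_spectrum`, the choice to keep only the magnitudes of the first few coefficients, and the toy input are all assumptions made for illustration, since the abstract does not say which coefficients are used or how they are classified.

```python
import cmath
import math


def pixel_dft(signal):
    """Naive O(T^2) discrete Fourier transform of one intensity-versus-time signal."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def temporal_spectrum(frames, num_coeffs):
    """Per pixel, return the magnitudes of the first num_coeffs DFT coefficients.

    frames: list of T frames, each a list of rows of grey-level intensities.
    The result is indexed as result[row][col][k].
    Keeping truncated magnitudes is an assumption, not taken from the paper.
    """
    height, width = len(frames[0]), len(frames[0][0])
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            # Intensity of pixel (r, c) as a function of time.
            signal = [frame[r][c] for frame in frames]
            coeffs = pixel_dft(signal)
            row.append([abs(z) for z in coeffs[:num_coeffs]])
        result.append(row)
    return result


# Toy 1x2 image sequence of four frames (hypothetical data):
# the left pixel is constant, the right pixel oscillates once over the sequence.
frames = [[[5, 0]], [[5, 1]], [[5, 0]], [[5, -1]]]
features = temporal_spectrum(frames, num_coeffs=2)
```

In this toy case the constant pixel puts all of its energy into the zero-frequency coefficient, while the oscillating pixel puts its energy into the first non-zero frequency, which is the kind of temporal signature such a representation is meant to capture.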

Editor information

Gerald Sommer, Kostas Daniilidis, Josef Pauli

Copyright information

© 1997 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yu, K., Jiang, X., Bunke, H. (1997). Lipreading using Fourier transform over time. In: Sommer, G., Daniilidis, K., Pauli, J. (eds) Computer Analysis of Images and Patterns. CAIP 1997. Lecture Notes in Computer Science, vol 1296. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63460-6_152

  • DOI: https://doi.org/10.1007/3-540-63460-6_152

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-63460-7

  • Online ISBN: 978-3-540-69556-1

  • eBook Packages: Springer Book Archive
