Abstract
An automatic speechreading recognizer uses information about motions produced by the oral-cavity regions of a speaker uttering a sentence. The ability to automatically ‘lipread’ a speaker using a sequence of image frames is an example of motion-based recognition.
This work has no reference to Mitre past or present.
Copyright information
© 1997 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Goldschen, A.J., Garcia, O.N., Petajan, E.D. (1997). Continuous Automatic Speech Recognition by Lipreading. In: Shah, M., Jain, R. (eds) Motion-Based Recognition. Computational Imaging and Vision, vol 9. Springer, Dordrecht. https://doi.org/10.1007/978-94-015-8935-2_14
DOI: https://doi.org/10.1007/978-94-015-8935-2_14
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-4870-7
Online ISBN: 978-94-015-8935-2
eBook Packages: Springer Book Archive