Summary
This chapter gives an overview of current speech recognition technology and the challenges it faces. It then discusses the main factors that limit today’s technology and presents results from studies comparing human and machine performance on the same task. Finally, it summarizes the most recent advances in robust speech processing, which are developed in the following chapters.
Copyright information
© 1996 Kluwer Academic Publishers
About this chapter
Cite this chapter
Junqua, JC., Haton, JP. (1996). The Current Technology and Its Limits: An Overview. In: Robustness in Automatic Speech Recognition. The Kluwer International Series in Engineering and Computer Science, vol 341. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1297-0_6
DOI: https://doi.org/10.1007/978-1-4613-1297-0_6
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8555-7
Online ISBN: 978-1-4613-1297-0
eBook Packages: Springer Book Archive