The Current Technology and Its Limits: An Overview

Junqua, Jean-Claude; Haton, Jean-Paul

doi:10.1007/978-1-4613-1297-0_6

Part of the book series: The Kluwer International Series in Engineering and Computer Science (SECS, volume 341)

Summary

This chapter presents an overview of the current technology and the real challenges that we are facing. It then discusses the main factors that limit today’s technology and presents the results of studies that compared human and machine performance on the same task. Finally, it summarizes the most recent advances in robust speech processing, which are developed in the following chapters.

Author information

Authors and Affiliations

Speech Technology Laboratory, USA
Jean-Claude Junqua
CRIN - INRIA, France
Jean-Paul Haton

About this chapter

Cite this chapter

Junqua, JC., Haton, JP. (1996). The Current Technology and Its Limits: An Overview. In: Robustness in Automatic Speech Recognition. The Kluwer International Series in Engineering and Computer Science, vol 341. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1297-0_6

DOI: https://doi.org/10.1007/978-1-4613-1297-0_6
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4612-8555-7
Online ISBN: 978-1-4613-1297-0
eBook Packages: Springer Book Archive
