
The Current Technology and Its Limits: An Overview

  • Chapter
Robustness in Automatic Speech Recognition

Summary

This chapter presents an overview of current speech recognition technology and the challenges it faces. It then discusses the main factors that limit today’s technology and presents the results of studies that compared human and machine performance on the same task. Finally, it summarizes the most recent advances in robust speech processing, which are developed in the following chapters.


Copyright information

© 1996 Kluwer Academic Publishers

About this chapter

Cite this chapter

Junqua, J.-C., Haton, J.-P. (1996). The Current Technology and Its Limits: An Overview. In: Robustness in Automatic Speech Recognition. The Kluwer International Series in Engineering and Computer Science, vol 341. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1297-0_6

  • DOI: https://doi.org/10.1007/978-1-4613-1297-0_6

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4612-8555-7

  • Online ISBN: 978-1-4613-1297-0

  • eBook Packages: Springer Book Archive
