Robust Speech Recognition

Furui, Sadaoki

doi:10.1007/978-3-642-60087-6_11

Part of the book series: NATO ASI Series (NATO ASI F, volume 169)

Summary

This paper surveys the main technologies recently developed to make speech recognition systems more robust against acoustic variations. The technologies are reviewed from the viewpoint of a stochastic pattern matching paradigm for speech recognition. Improved robustness enables better speech recognition over a wide range of unexpected and adverse conditions by reducing mismatches between training and testing speech utterances.

Author information

Authors and Affiliations

Department of Computer Science, Tokyo Institute of Technology, 2-12-1, Ookayama, Meguro-ku, Tokyo, 152, Japan
Sadaoki Furui

Editor information

Editors and Affiliations

Speech Research Unit, DERA Malvern, St. Andrew’s Road, WR14 4DT, Great Malvern, Worcs, UK
Keith Ponting

About this chapter

Cite this chapter

Furui, S. (1999). Robust Speech Recognition. In: Ponting, K. (ed.) Computational Models of Speech Pattern Processing. NATO ASI Series, vol 169. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60087-6_11

DOI: https://doi.org/10.1007/978-3-642-60087-6_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-64250-0
Online ISBN: 978-3-642-60087-6
eBook Packages: Springer Book Archive
