Abstract
Achieving reliable performance in a speech recognizer for car telephone applications has been studied intensively for more than a decade. This paper addresses the effects of mismatched conditions, and how to minimize them, on the performance of speaker-independent isolated-word recognition in a car-noise environment; the Lombard effect is not considered. The study is primarily intended to evaluate how the recognition rate depends on the signal-to-noise ratio (SNR) of the input signal, both without any noise compensation and with a noise-compensation or noise-adaptive method, and especially to find the conditions under which an isolated-word recognizer can be used in a real car-noise environment. When hidden Markov models (HMMs) are trained on noisy speech with an SNR of 10 dB, noisy speech with an SNR between 40 dB and 5 dB can be recognized with a recognition rate better than 93%. If modified spectral subtraction is used and the models are trained on the enhanced speech, the lower limit of this SNR interval extends to 0 dB. If the parallel model combination (PMC) technique is used, there is no need to train models on noisy or enhanced speech. The model adaptation makes it possible to recognize noisy speech with any SNR from 40 dB to -10 dB with a recognition rate greater than 73% (for an SNR from 40 dB to 5 dB, the recognition rate is above 93%). In this respect, PMC offers great flexibility with better recognition rates than the other noise-compensation techniques.
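The results above are stated in terms of the SNR of the input signal, so it may help to see how noisy test material at a prescribed SNR can be produced. The following is a minimal sketch, not the authors' procedure: it scales a noise recording so that its mean power sits a chosen number of decibels below the speech power, then adds it to the speech. The function names, the 8 kHz sampling rate, and the synthetic tone and Gaussian noise standing in for speech and car noise are all assumptions made for illustration.

```python
import math
import random


def power(signal):
    """Mean power (mean squared amplitude) of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def snr_db(speech, noise):
    """SNR in dB of a speech signal relative to a noise signal."""
    return 10.0 * math.log10(power(speech) / power(noise))


def mix_at_snr(speech, noise, target_snr_db):
    """Scale `noise` so that adding it to `speech` gives the target SNR.

    Returns (noisy, scaled_noise). Assumes both inputs have equal
    length and non-zero power.
    """
    # Required noise power: P_speech / 10^(SNR_dB / 10)
    wanted_noise_power = power(speech) / (10.0 ** (target_snr_db / 10.0))
    gain = math.sqrt(wanted_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


if __name__ == "__main__":
    rng = random.Random(0)
    fs = 8000  # assumed sampling rate in Hz, for illustration only
    # Synthetic stand-ins: a 200 Hz tone for "speech", Gaussian noise for "car noise".
    speech = [math.sin(2 * math.pi * 200 * t / fs) for t in range(fs)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]
    for target in (40, 10, 5, 0, -10):
        _, scaled = mix_at_snr(speech, noise, target)
        print(target, round(snr_db(speech, scaled), 6))
```

The SNR here is computed over the whole signal. With real recordings, it would typically be measured over speech-active segments only, which this sketch does not attempt.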
Copyright information
© 1998 Springer Science+Business Media New York
About this chapter
Cite this chapter
Kreisinger, T., Sovka, P., Pollák, P., Uhlíř, J. (1998). Experimental Study of Speech Recognition in Noisy Environments. In: Procházka, A., Uhlíř, J., Rayner, P.W.J., Kingsbury, N.G. (eds) Signal Analysis and Prediction. Applied and Numerical Harmonic Analysis. Birkhäuser, Boston, MA. https://doi.org/10.1007/978-1-4612-1768-8_32
DOI: https://doi.org/10.1007/978-1-4612-1768-8_32
Publisher Name: Birkhäuser, Boston, MA
Print ISBN: 978-1-4612-7273-1
Online ISBN: 978-1-4612-1768-8
eBook Packages: Springer Book Archive