Least Squares Filtering of Speech Signals for Robust ASR
The behavior of the least squares filter (LeSF) is analyzed for a class of non-stationary signals composed of multiple sinusoids embedded in white noise, where the frequencies, phases, and amplitudes of the sinusoids may vary from block to block. Analytic expressions for the weights and the output of the LeSF are derived as functions of the block length and of the signal SNR computed over the corresponding block. Recognizing that such a sinusoidal model is a valid approximation to speech signals, we use an LeSF estimated on each block to enhance speech signals embedded in white noise. Automatic speech recognition (ASR) experiments on a connected numbers task, OGI Numbers95, show that the proposed LeSF-based features yield an increase in recognition performance in various non-stationary noise conditions, compared both with un-enhanced speech and with noise-robust JRASTA-PLP features.
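The block-wise estimation described above can be illustrated with a minimal sketch. It assumes the LeSF takes a delayed linear-prediction form, in which each sample is predicted from earlier samples of the same noisy block and the weights are fitted by ordinary least squares on that block alone. The function name `lesf_block`, the filter order, the delay, and the test signal are all illustrative choices, not values taken from the paper.

```python
import numpy as np

def lesf_block(x, order=20, delay=1):
    """Fit a least squares predictor on one block and apply it to that block.

    Assumed form (not confirmed by the abstract): x[t] is predicted from
    x[t-delay], x[t-delay-1], ..., x[t-delay-order+1].
    """
    n = len(x)
    start = delay + order - 1  # first sample with a full set of past inputs
    # Each row holds the delayed inputs for one target sample, newest first.
    A = np.array([x[t - delay - order + 1 : t - delay + 1][::-1]
                  for t in range(start, n)])
    d = x[start:]  # targets: the noisy samples themselves
    w, *_ = np.linalg.lstsq(A, d, rcond=None)
    y = np.zeros(n)
    y[start:] = A @ w  # filter output; the first `start` samples stay zero
    return y, w

# Illustrative block: one sinusoid in white noise at roughly 0 dB SNR.
rng = np.random.default_rng(0)
t = np.arange(500)
clean = np.sqrt(2.0) * np.cos(2 * np.pi * 0.1 * t + 0.3)
noisy = clean + rng.standard_normal(t.size)

y, w = lesf_block(noisy, order=20, delay=1)

# Compare errors only where the filter output is defined.
start = 20
mse_in = np.mean((noisy[start:] - clean[start:]) ** 2)
mse_out = np.mean((y[start:] - clean[start:]) ** 2)
print(f"input MSE {mse_in:.3f}, output MSE {mse_out:.3f}")
```

Because white noise is uncorrelated from sample to sample while a sinusoid is predictable from its past, the fitted predictor passes the sinusoid and attenuates the noise. For a non-stationary signal, the same function would be called separately on each block so that the weights follow the changing frequencies, phases, and amplitudes.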
Keywords: Speech Signal, Automatic Speech Recognition, Frame Length, Noisy Speech, Automatic Speech Recognition System