Abstract
This chapter describes a novel single-channel in-car speech enhancement method that estimates the log spectra of speech as captured by a close-talking microphone. The method is based on nonlinear regression from the log spectra of the noisy signal captured by a distant microphone and of the estimated noise. We compare the speech enhancement performance of the proposed method with that of methods based on spectral subtraction (SS) and short-time spectral attenuation (STSA). In our subjective evaluation, speech enhanced with the regression method shows a significant improvement in overall quality. We also conducted isolated word recognition experiments on a dataset covering 15 real car driving conditions. The proposed adaptive nonlinear regression approach reduces the average word error rate (WER) by 54.2% and 16.5%, respectively, compared with the original noisy speech and with the ETSI advanced front-end [15].
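The regression idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the chapter: the synthetic single-bin data, the small one-hidden-layer network trained by plain gradient descent, and the function names make_synthetic_frames, fit_log_spectral_regressor, and enhance. It only shows the general shape of mapping (noisy log spectrum, estimated noise log spectrum) to a close-talking log spectrum.

```python
import numpy as np

def make_synthetic_frames(n_frames=200, seed=0):
    """Hypothetical toy data for one frequency bin (not the chapter's corpus)."""
    rng = np.random.default_rng(seed)
    speech_pow = rng.uniform(1.0, 10.0, n_frames)  # stand-in for close-talking speech power
    noise_pow = rng.uniform(0.5, 5.0, n_frames)    # stand-in for the estimated noise power
    # Inputs: log spectrum of the noisy signal and of the estimated noise.
    feats = np.column_stack([np.log(speech_pow + noise_pow), np.log(noise_pow)])
    target = np.log(speech_pow)[:, None]           # log spectrum to be estimated
    return feats, target

def fit_log_spectral_regressor(feats, target, hidden=8, lr=0.05, epochs=500, seed=0):
    """Fit a small tanh network by full-batch gradient descent on squared error."""
    rng = np.random.default_rng(seed)
    mu, sigma = feats.mean(0), feats.std(0)
    t_mean = target.mean()
    x = (feats - mu) / sigma                       # standardise inputs
    y = target - t_mean                            # centre the target
    w1 = rng.normal(0.0, 0.3, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.3, (hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)                   # hidden layer (the nonlinearity)
        err = h @ w2 + b2 - y
        losses.append(float(np.mean(err ** 2)))
        g_out = 2.0 * err / len(x)                 # gradient of the mean squared error
        g_hid = (g_out @ w2.T) * (1.0 - h ** 2)    # backpropagate through tanh
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(0)
        w1 -= lr * (x.T @ g_hid)
        b1 -= lr * g_hid.sum(0)
    model = dict(w1=w1, b1=b1, w2=w2, b2=b2, mu=mu, sigma=sigma, t_mean=t_mean)
    return model, losses

def enhance(model, feats):
    """Map (noisy, noise) log spectra to an estimated close-talking log spectrum."""
    x = (feats - model["mu"]) / model["sigma"]
    h = np.tanh(x @ model["w1"] + model["b1"])
    return h @ model["w2"] + model["b2"] + model["t_mean"]
```

Calling make_synthetic_frames(), then fit_log_spectral_regressor(feats, target), and finally enhance(model, feats) gives one estimate per frame. A real system would work on full spectra from short-time analysis of recorded speech and would need a separate noise estimator, neither of which is modelled here.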
References
S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Trans. Acoustics, Speech and Signal Processing, vol.ASSP-27, no.2, pp.113–120, 1979.
O. Cappe and J. Laroche, “Evaluation of short-time spectral attenuation techniques for the restoration of music recordings,” IEEE Trans. Speech and Audio Processing, vol.3, no.1, 1995.
R. Martin, “Speech Enhancement Using MMSE Short Time Spectral Estimation with Gamma Distributed Speech Priors,” Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp.253–256, 2002.
Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error log-spectral amplitude estimator,” IEEE Trans. Acoustics, Speech and Signal Processing, vol.ASSP-33, no.2, pp.443–445, 1985.
W. Li, T. Shinde, H. Fujimura, C. Miyajima, T. Nishino, K. Itou, K. Takeda, and F. Itakura, “Multiple regression of log spectra for in-car speech recognition using multiple distributed microphones,” IEICE Trans. on Information & Systems, vol.E88-D, no.3, pp.384–390, 2005.
W. Li, K. Itou, K. Takeda, and F. Itakura, “Optimizing regression for in-car speech recognition using multiple distributed microphones,” Proc. International Conference on Spoken Language Processing, pp.2689–2692, 2004.
S. Haykin, Neural Networks — A Comprehensive Foundation, Prentice-Hall, 1999.
S.R. Quackenbush, T.P. Barnwell, and M.A. Clements, Objective Measures of Speech Quality, Prentice-Hall, 1988.
J. E. Porter and S. F. Boll, “Optimal estimators for spectral restoration of noisy speech,” Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.18.A.2.1–18.A.2.4, 1984.
F. Xie and D.V. Compernolle, “Speech enhancement by spectral magnitude estimation — A unifying approach,” Speech Communication, vol. 19, pp.89–104, 1996.
B.L. Sim, Y.C. Tong, J.S. Chang, and C.T. Tan, “A parametric formulation of the generalized spectral subtraction method,” IEEE Trans. Speech and Audio Processing, vol.6, no.4, pp.328–337, 1998.
I. Cohen, “Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging,” IEEE Trans. Speech and Audio Processing, vol. 11, no.5, pp.466–475, 2003.
J.H.L. Hansen and B.L. Pellom, “An effective quality evaluation protocol for speech enhancement algorithms,” Proc. International Conference on Spoken Language Processing, pp.2819–2822, 1998.
M. Marzinzik, Noise reduction schemes for digital hearing aids and their use for the hearing impaired, Ph.D. thesis, University of Oldenburg, 2000.
“Speech processing, transmission and quality aspects (STQ); distributed speech recognition; advanced front-end feature extraction algorithm; compression algorithm,” ETSI ES 202050 v1.1.1, 2002.
N. Kawaguchi, S. Matsubara, I. Kishida, Y. Irie, H. Murao, Y. Yamaguchi, K. Takeda and F. Itakura, “Construction and Analysis of the Multi-layered In-car Spoken Dialogue Corpus,” Chapter 1 in DSP in Vehicular and Mobile Systems, H. Abut, J. H.L. Hansen, and K. Takeda (Editors), Springer, New York, NY, 2005.
Copyright information
© 2007 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
Li, W., Itou, K., Takeda, K., Itakura, F. (2007). Comparative Studies on Single-Chanel De-Noising Schemes for In-Car Speech Enhancement. In: Abut, H., Hansen, J.H.L., Takeda, K. (eds) Advances for In-Vehicle and Mobile Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-45976-9_9
DOI: https://doi.org/10.1007/978-0-387-45976-9_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-33503-2
Online ISBN: 978-0-387-45976-9
eBook Packages: Engineering, Engineering (R0)