Comparative Studies on Single-Channel De-Noising Schemes for In-Car Speech Enhancement

Li, Weifeng; Itou, Katunobu; Takeda, Kazuya; Itakura, Fumitada

doi:10.1007/978-0-387-45976-9_9


Abstract

This chapter describes a novel single-channel in-car speech enhancement method that estimates the log spectra of speech as it would be captured by a close-talking microphone. The estimate is obtained by nonlinear regression on the log spectra of the noisy signal captured by a distant microphone and of the estimated noise. We compare the speech enhancement performance of the proposed method with that of methods based on spectral subtraction (SS) and short-time spectral attenuation (STSA). In our subjective evaluation, speech enhanced with the regression method shows a significant improvement in overall quality. We also conducted isolated word recognition experiments on a dataset covering 15 real car driving conditions. The proposed adaptive nonlinear regression approach reduces the average word error rate (WER) by 54.2% compared with the original noisy speech and by 16.5% compared with the ETSI advanced front-end [15].
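
As a rough illustration of the regression idea in the abstract, the sketch below fits a small one-hidden-layer network that maps the log power of a noisy signal and of a noise estimate to the log power of the clean signal, for a single frequency bin. Everything in it is an assumption made for illustration: the synthetic data, the network size, the training settings, and the function names (make_synthetic, init_mlp, forward, mse, train) are hypothetical and are not taken from the chapter, whose actual features, noise estimator, and regression model are not given in the abstract.

```python
import numpy as np


def make_synthetic(n_frames, rng):
    """Toy data for one frequency bin (an assumption, not the chapter's corpus)."""
    clean_log = rng.normal(0.0, 1.0, n_frames)   # stand-in for close-talking log power
    noise_log = rng.normal(-1.0, 0.5, n_frames)  # stand-in for estimated noise log power
    # Assume clean and noise powers add at the distant microphone.
    noisy_log = np.log(np.exp(clean_log) + np.exp(noise_log))
    x = np.stack([noisy_log, noise_log], axis=1)  # regression inputs, shape (n, 2)
    y = clean_log[:, None]                        # regression target, shape (n, 1)
    return x, y


def init_mlp(n_in, n_hidden, n_out, rng):
    """One hidden layer with tanh units and a linear output layer."""
    return {
        "W1": rng.normal(0.0, 0.5, (n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.5, (n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }


def forward(params, x):
    """Return the network output and the hidden activations."""
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"], h


def mse(params, x, y):
    """Mean squared error between predicted and target log power."""
    pred, _ = forward(params, x)
    return float(np.mean((pred - y) ** 2))


def train(params, x, y, lr=0.05, epochs=2000):
    """Full-batch gradient descent on the mean squared error."""
    n = x.shape[0]
    for _ in range(epochs):
        pred, h = forward(params, x)
        d_pred = 2.0 * (pred - y) / n                  # dL/dpred
        d_h = (d_pred @ params["W2"].T) * (1.0 - h ** 2)  # backprop through tanh
        params["W2"] -= lr * (h.T @ d_pred)
        params["b2"] -= lr * d_pred.sum(axis=0)
        params["W1"] -= lr * (x.T @ d_h)
        params["b1"] -= lr * d_h.sum(axis=0)
    return params


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x, y = make_synthetic(2000, rng)
    params = init_mlp(n_in=2, n_hidden=8, n_out=1, rng=rng)
    print("MSE using the noisy log power directly:", float(np.mean((x[:, :1] - y) ** 2)))
    print("MSE before training:", mse(params, x, y))
    params = train(params, x, y)
    print("MSE after training:", mse(params, x, y))
```

In a real system the inputs would come from short-time spectral analysis of the distant-microphone signal and from a separate noise estimator, with one regressor per frequency bin or filter-bank channel; the toy setup above only shows why the mapping is nonlinear in the log domain.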


References

[1] S. F. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Trans. Acoustics, Speech and Signal Processing, vol. ASSP-27, no. 2, pp. 113–120, 1979.
[2] O. Cappe and J. Laroche, “Evaluation of short-time spectral attenuation techniques for the restoration of music recordings,” IEEE Trans. Speech and Audio Processing, vol. 3, no. 1, 1995.
[3] R. Martin, “Speech Enhancement Using MMSE Short Time Spectral Estimation with Gamma Distributed Speech Priors,” Proc. IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 253–256, 2002.
[4] Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error log-spectral amplitude estimator,” IEEE Trans. Acoustics, Speech and Signal Processing, vol. ASSP-33, no. 2, pp. 443–445, 1985.
[5] W. Li, T. Shinde, H. Fujimura, C. Miyajima, T. Nishino, K. Itou, K. Takeda, and F. Itakura, “Multiple regression of log spectra for in-car speech recognition using multiple distributed microphones,” IEICE Trans. on Information & Systems, vol. E88-D, no. 3, pp. 384–390, 2005.
[6] W. Li, K. Itou, K. Takeda, and F. Itakura, “Optimizing regression for in-car speech recognition using multiple distributed microphones,” Proc. International Conference on Spoken Language Processing, pp. 2689–2692, 2004.
[7] S. Haykin, Neural Networks — A Comprehensive Foundation, Prentice-Hall, 1999.
[8] S. R. Quackenbush, T. P. Barnwell, and M. A. Clements, Objective Measures of Speech Quality, Prentice-Hall, 1988.
[9] J. E. Porter and S. F. Boll, “Optimal estimators for spectral restoration of noisy speech,” Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. 18.A.2.1–18.A.2.4, 1984.
[10] F. Xie and D. V. Compernolle, “Speech enhancement by spectral magnitude estimation — A unifying approach,” Speech Communication, vol. 19, pp. 89–104, 1996.
[11] B. L. Sim, Y. C. Tong, J. S. Chang, and C. T. Tan, “A parametric formulation of the generalized spectral subtraction method,” IEEE Trans. Speech and Audio Processing, vol. 6, no. 4, pp. 328–337, 1998.
[12] I. Cohen, “Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging,” IEEE Trans. Speech and Audio Processing, vol. 11, no. 5, pp. 466–475, 2003.
[13] J. H. L. Hansen and B. L. Pellom, “An effective quality evaluation protocol for speech enhancement algorithms,” Proc. International Conference on Spoken Language Processing, pp. 2819–2822, 1998.
[14] M. Marzinzik, Noise reduction schemes for digital hearing aids and their use for the hearing impaired, Ph.D. thesis, University of Oldenburg, 2000.
[15] “Speech processing, transmission and quality aspects (STQ); distributed speech recognition; advanced front-end feature extraction algorithm; compression algorithm,” ETSI ES 202050 v1.1.1, 2002.
[16] N. Kawaguchi, S. Matsubara, I. Kishida, Y. Irie, H. Murao, Y. Yamaguchi, K. Takeda, and F. Itakura, “Construction and Analysis of the Multi-layered In-car Spoken Dialogue Corpus,” Chapter 1 in DSP in Vehicular and Mobile Systems, H. Abut, J. H. L. Hansen, and K. Takeda (Editors), Springer, New York, NY, 2005.


Author information

Authors and Affiliations

Graduate School of Engineering, Graduate School of Information Science, Nagoya University, 464-8603, Japan
Weifeng Li, Katunobu Itou, Kazuya Takeda & Fumitada Itakura
Faculty of Science and Technology, Meijo University Nagoya, 464-8603, Japan
Weifeng Li, Katunobu Itou, Kazuya Takeda & Fumitada Itakura

Editor information

Editors and Affiliations

San Diego State University, San Diego, California, USA
Hüseyin Abut
Sabanci University, Turkey
Hüseyin Abut
Center for Robust Speech Systems (CRSS) Department of Electrical Engineering, Erik Jonsson School of Engineering & Computer Science, University of Texas at Dallas, Richardson, TX, USA
John H. L. Hansen
Department of Media Science, Nagoya University, Nagoya, Japan
Kazuya Takeda


About this chapter

Cite this chapter

Li, W., Itou, K., Takeda, K., Itakura, F. (2007). Comparative Studies on Single-Channel De-Noising Schemes for In-Car Speech Enhancement. In: Abut, H., Hansen, J.H.L., Takeda, K. (eds) Advances for In-Vehicle and Mobile Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-45976-9_9


DOI: https://doi.org/10.1007/978-0-387-45976-9_9
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-33503-2
Online ISBN: 978-0-387-45976-9
eBook Packages: Engineering, Engineering (R0)
