Blind Speech Extraction Combining Generalized MMSE STSA Estimator and ICA-Based Noise and Speech Probability Density Function Estimations

Saruwatari, Hiroshi; Okamoto, Ryoi; Takahashi, Yu; Shikano, Kiyohiro

doi:10.1007/978-3-642-15995-4_7

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6365)

Abstract

In this paper, we propose a new blind speech extraction method that combines ICA-based dynamic noise estimation with a generalized minimum mean-square-error short-time spectral amplitude (MMSE STSA) estimator of the target speech. To handle speech signals with different probability density functions (p.d.f.s), we also introduce a spectral-subtraction-based speech p.d.f. estimation and provide a theoretical justification of the proposed approach. We conduct an experiment in an actual railway-station environment and show, through objective and subjective evaluations, that the proposed method improves noise reduction.
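
For readers unfamiliar with spectral subtraction, which the abstract names as the basis of the speech p.d.f. estimation, the following is a minimal sketch of the classical per-bin power subtraction with flooring. It is not the authors' method: the function name `spectral_subtraction` and the parameters `beta` (oversubtraction factor) and `floor` (flooring factor) are assumptions chosen for illustration, and the noise power is simply passed in rather than estimated by ICA.

```python
def spectral_subtraction(noisy_power, noise_power, beta=1.0, floor=0.01):
    """Illustrative per-bin power spectral subtraction with flooring.

    noisy_power: observed power spectrum |Y(f)|^2, one value per frequency bin
    noise_power: estimated noise power spectrum |N(f)|^2 for the same bins
    beta:        oversubtraction factor (hypothetical default)
    floor:       fraction of the noisy power kept when the subtraction
                 would otherwise go negative (hypothetical default)
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")

    enhanced = []
    for y, n in zip(noisy_power, noise_power):
        # Subtract the scaled noise estimate from the observed power.
        diff = y - beta * n
        # Clamp to a small fraction of the observation so that the
        # estimated speech power never becomes negative.
        enhanced.append(max(diff, floor * y))
    return enhanced


if __name__ == "__main__":
    # Bin 0: speech dominates, so the noise power is subtracted.
    # Bin 1: noise estimate exceeds the observation, so the floor applies.
    print(spectral_subtraction([4.0, 1.0], [1.0, 2.0]))
```

In this sketch the first bin keeps the difference between observed and noise power, while the second bin falls back to the floor because the noise estimate exceeds the observation.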

Author information

Authors and Affiliations

Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara, 630-0192, Japan
Hiroshi Saruwatari, Ryoi Okamoto, Yu Takahashi & Kiyohiro Shikano

Editor information

Editors and Affiliations

Dept. of Electrical Engineering, Université d’Evry Val d’Essonne, 40 rue du Pelvoux, 91020, Courcouronnes, France
Vincent Vigneron
Laboratoire I3S, Les Algorithmes - Euclide-B, BP 121, Université de Nice-Sophia Antipolis, 2000 Route des Lucioles, 06903, Sophia Antipolis Cedex, France
Vicente Zarzoso
School of Engineering, Dept. of Telecommunications, ISITV, Université de Toulon, Avenue George Pompidou, BP 56, La Valette du Var, Cedex, 83162, France
Eric Moreau
INRIA France, Equipe-projet METISS, Centre de Recherche INRIA Rennes-Bretagne Atlantique, Campus de Beaulieu, 35042, Rennes cedex, France
Rémi Gribonval
INRIA France, Equipe-projet METISS, Centre de Recherche INRIA Rennes-Bretagne Atlantique, Campus de Beaulieu, 35042, Rennes Cedex, France
Emmanuel Vincent

Cite this paper

Saruwatari, H., Okamoto, R., Takahashi, Y., Shikano, K. (2010). Blind Speech Extraction Combining Generalized MMSE STSA Estimator and ICA-Based Noise and Speech Probability Density Function Estimations. In: Vigneron, V., Zarzoso, V., Moreau, E., Gribonval, R., Vincent, E. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2010. Lecture Notes in Computer Science, vol 6365. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15995-4_7

DOI: https://doi.org/10.1007/978-3-642-15995-4_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15994-7
Online ISBN: 978-3-642-15995-4
eBook Packages: Computer Science, Computer Science (R0)
