Frequency-Domain Blind Source Separation

Makino, Shoji; Sawada, Hiroshi; Araki, Shoko

doi:10.1007/978-1-4020-6479-1_2

Part of the book series: Signals and Communication Technology (SCT)

This chapter explains the frequency-domain approach to the blind source separation of acoustic signals mixed in a real room environment. With the application of short-time Fourier transforms, convolutive mixtures in the time domain can be approximated as multiple instantaneous mixtures in the frequency domain. Separation is therefore performed in each frequency bin with a simple instantaneous separation matrix. We employ complex-valued independent component analysis (ICA) to calculate the separation matrix. The permutation ambiguity of the ICA solutions must then be resolved across frequency bins so that the separated signals are reconstructed properly in the time domain. We estimate the time difference of arrival (TDOA) of each source at the microphones from the ICA solutions. The frequency-dependent TDOA estimates are then clustered in order to align the permutations. We also consider the use of time–frequency masking for cases where separation by linear filters is insufficient when the sources outnumber the microphones. Experimental results are shown for a simple 3-source 3-microphone case, and also for a more complicated case with many background interference signals.
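
The TDOA-based permutation alignment described above can be illustrated with a small sketch. It is not the authors' implementation. It assumes an anechoic far-field model in which each basis vector (a column of the inverse of a bin's separation matrix) has elements proportional to exp(-j2πfτ), and it assumes frequencies low enough that the phase does not wrap. The function names `tdoa_from_basis_vector` and `align_permutations`, the toy TDOA values, and the k-means-style update with an exhaustive permutation search are all hypothetical choices made for illustration.

```python
import cmath
import math
from itertools import permutations


def tdoa_from_basis_vector(a, freq_hz, ref=0, mic=1):
    """Estimate the TDOA (seconds) of one source between two microphones.

    Assumed model: a[j] = g * exp(-1j * 2*pi*f * tau_j). Taking the ratio of
    two elements cancels the unknown complex gain g (the scaling ambiguity).
    """
    ratio = a[mic] / a[ref]
    return -cmath.phase(ratio) / (2.0 * math.pi * freq_hz)


def align_permutations(basis_by_bin, freqs_hz, n_iter=5):
    """Cluster per-bin TDOA estimates and return one permutation per bin.

    basis_by_bin[k] holds the basis vectors of bin k in arbitrary order.
    perms[k][c] is the index of the basis vector assigned to cluster c.
    """
    tdoas = [[tdoa_from_basis_vector(a, f) for a in vecs]
             for vecs, f in zip(basis_by_bin, freqs_hz)]
    n_src = len(tdoas[0])
    centroids = sorted(tdoas[0])  # initialise from the first bin
    perms = []
    for _ in range(n_iter):
        # Assignment step: best one-to-one mapping of estimates to centroids.
        perms = [min(permutations(range(n_src)),
                     key=lambda p: sum((t[p[c]] - centroids[c]) ** 2
                                       for c in range(n_src)))
                 for t in tdoas]
        # Update step: each centroid becomes the mean of its members.
        centroids = [sum(t[p[c]] for t, p in zip(tdoas, perms)) / len(tdoas)
                     for c in range(n_src)]
    return perms, centroids


# Toy data (assumed values): 2 sources, 2 microphones, 7 frequency bins.
true_tdoas = [5e-5, -8e-5]                 # seconds
gains = (0.7 + 0.2j, -0.3 + 0.9j)          # arbitrary per-source scaling
freqs = [500.0 * k for k in range(1, 8)]   # 500 Hz to 3500 Hz, no phase wrap

basis_by_bin = []
for k, f in enumerate(freqs):
    vecs = [[g, g * cmath.exp(-2j * math.pi * f * tau)]
            for tau, g in zip(true_tdoas, gains)]
    if k % 2 == 1:
        vecs.reverse()                     # simulate a permuted ICA solution
    basis_by_bin.append(vecs)

perms, centroids = align_permutations(basis_by_bin, freqs)
```

In this toy setup `centroids` recovers the two simulated TDOAs, and `perms` alternates between the two orderings, undoing the swap applied at every odd bin. With real room recordings the estimates are noisy and reverberant, so a sketch like this would only be a starting point.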

Author information

Authors and Affiliations

NTT Communication Science Labs, NTT Corporation, 2-4 Hikaridai, 619-0237, Soraku-gun, Kyoto, Japan
Shoji Makino, Hiroshi Sawada & Shoko Araki

Editor information

Editors and Affiliations

NTT Corporation, 2-4 Hikaridai, 619-0237, Soraku-gun, Kyoto, Japan
Shoji Makino & Hiroshi Sawada
University of California, San Diego, 9500 Gilman Drive, 0523, 92093-0523, La Jolla, CA, USA
Te-Won Lee

About this chapter

Cite this chapter

Makino, S., Sawada, H., Araki, S. (2007). Frequency-Domain Blind Source Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_2

DOI: https://doi.org/10.1007/978-1-4020-6479-1_2
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6478-4
Online ISBN: 978-1-4020-6479-1
eBook Packages: Engineering, Engineering (R0)