Frequency-Domain Blind Source Separation

  • Chapter
Blind Speech Separation

Part of the book series: Signals and Communication Technology (SCT)

This chapter explains the frequency-domain approach to the blind source separation of acoustic signals mixed in a real room environment. By applying the short-time Fourier transform, convolutive mixtures in the time domain can be approximated as multiple instantaneous mixtures in the frequency domain. Separation is therefore performed in each frequency bin with a simple instantaneous separation matrix. We employ complex-valued independent component analysis (ICA) to calculate the separation matrix. The permutation ambiguity of the ICA solutions must then be resolved consistently across frequency bins so that the separated signals are reconstructed properly in the time domain. To do so, we estimate the time difference of arrival (TDOA) of each source at the microphones from the ICA solutions and cluster the frequency-dependent TDOA estimates to align the permutations. We also consider time–frequency masking for the case where the sources outnumber the microphones and separation by linear filters alone is insufficient. Experimental results are shown for a simple 3-source, 3-microphone case and for a more complicated case with many background interference signals.
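The TDOA-based permutation alignment described above can be illustrated with a small toy sketch. It is not the chapter's implementation: it assumes an anechoic two-microphone, two-source model, assumes that ICA yields one basis vector per source in each frequency bin (known only up to an arbitrary complex scale and an arbitrary source order), and replaces the clustering step with a plain sort, which is enough in this noise-free setting. All names (`TRUE_TDOAS`, `simulate_bin`, `align_bin`, `run_demo`) and the numeric values are hypothetical.

```python
import cmath
import math
import random

# Hypothetical true TDOAs (seconds) of two sources between mic 2 and mic 1.
# They are small enough that the phase stays within (-pi, pi) up to 4 kHz,
# so no spatial aliasing occurs in this toy setup.
TRUE_TDOAS = [-1.0e-4, 0.5e-4]

def basis_vector(tdoa, freq_hz):
    # Anechoic model: mic 1 is the reference, mic 2 is delayed by tdoa.
    return [1.0 + 0j, cmath.exp(-2j * math.pi * freq_hz * tdoa)]

def estimate_tdoa(vec, freq_hz):
    # The element ratio cancels the arbitrary complex scale left by ICA;
    # its phase equals -2*pi*f*tdoa under the model above.
    return -cmath.phase(vec[1] / vec[0]) / (2 * math.pi * freq_hz)

def simulate_bin(freq_hz, rng):
    # Mimic the ICA ambiguities: random source order and random complex scale.
    order = [0, 1]
    rng.shuffle(order)
    vecs = []
    for k in order:
        scale = cmath.rect(rng.uniform(0.5, 2.0), rng.uniform(-math.pi, math.pi))
        vecs.append([scale * x for x in basis_vector(TRUE_TDOAS[k], freq_hz)])
    return order, vecs

def align_bin(vecs, freq_hz):
    # Sorting by estimated TDOA stands in for clustering (toy simplification).
    tdoas = [estimate_tdoa(v, freq_hz) for v in vecs]
    perm = sorted(range(len(vecs)), key=lambda i: tdoas[i])
    return perm, [tdoas[i] for i in perm]

def run_demo(seed=0):
    rng = random.Random(seed)
    results = []
    for freq_hz in range(250, 4001, 250):
        order, vecs = simulate_bin(freq_hz, rng)
        perm, tdoas = align_bin(vecs, freq_hz)
        # Map the aligned positions back to the true source indices.
        recovered = [order[i] for i in perm]
        results.append((freq_hz, recovered, tdoas))
    return results

if __name__ == "__main__":
    for freq_hz, recovered, tdoas in run_demo():
        print(freq_hz, recovered, tdoas)
```

In a real room the per-bin estimates are noisy and reverberation perturbs the phase, which is why a clustering step rather than a simple sort is needed to group the estimates by source.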

Copyright information

© 2007 Springer

About this chapter

Cite this chapter

Makino, S., Sawada, H., Araki, S. (2007). Frequency-Domain Blind Source Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_2

  • DOI: https://doi.org/10.1007/978-1-4020-6479-1_2

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-6478-4

  • Online ISBN: 978-1-4020-6479-1

  • eBook Packages: Engineering, Engineering (R0)
