Blind Speech Separation, pp. 271-304

# Underdetermined Blind Source Separation of Convolutive Mixtures by Hierarchical Clustering and L1-Norm Minimization

In this chapter we present a complete two-stage solution for underdetermined blind source separation (BSS) of convolutive speech mixtures. The first stage estimates the mixing system by hierarchical clustering. The second stage estimates the source signals from the estimated mixing system, using the common assumption of independent and identically distributed sources. Modeling the sources with a Laplacian distribution leads to ℓ1-norm minimization.
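
To make the second stage concrete, the following is a minimal sketch of ℓ1-norm minimization for a single observation vector, assuming the mixing matrix is already known. It is not the chapter's implementation: it covers only the real-valued case, which reduces to a linear program, whereas frequency-domain mixtures are complex-valued and call for a different solver such as second-order cone programming. The function name `l1_min` and the 2-sensor, 3-source matrix `A` are hypothetical and chosen for illustration.

```python
import numpy as np
from scipy.optimize import linprog


def l1_min(A, x):
    """Return argmin ||s||_1 subject to A @ s = x (real-valued A and x)."""
    n = A.shape[1]
    # Split s = u - v with u, v >= 0, so that ||s||_1 = sum(u) + sum(v)
    # at the optimum and the problem becomes a linear program.
    cost = np.ones(2 * n)
    A_eq = np.hstack([A, -A])
    res = linprog(cost, A_eq=A_eq, b_eq=x, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n] - res.x[n:]


# Hypothetical mixing matrix: 2 sensors, 3 sources, unit-norm columns.
angles = np.deg2rad([20.0, 75.0, 130.0])
A = np.vstack([np.cos(angles), np.sin(angles)])

# Sparse source vector: only the second source is active.
s_true = np.array([0.0, 1.5, 0.0])
x = A @ s_true

s_hat = l1_min(A, x)
```

Because the columns of `A` have unit norm and the observation lies along a single column, the minimum ℓ1-norm solution coincides with the sparse source vector in this example. With several sources active at once, the ℓ1 solution is in general only an approximation.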



## Copyright information

© Springer 2007