Speech Separation via Parallel Factor Analysis of Cross-Frequency Covariance Tensor

Gong, Xiao-Feng; Lin, Qiu-Hua

doi:10.1007/978-3-642-15995-4_9


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6365)

Included in the following conference series:

International Conference on Latent Variable Analysis and Signal Separation

Abstract

This paper considers the separation of convolutive speech mixtures in the frequency domain within a tensorial framework. By assuming that components of the same source in neighboring frequency bins remain correlated, a set of cross-frequency covariance tensors with trilinear structure is established, and an algorithm consisting of consecutive parallel factor (PARAFAC) decompositions is developed. Each PARAFAC decomposition in the proposed method simultaneously estimates two neighboring frequency responses, one of which is a common factor shared with the subsequent cross-frequency covariance tensor and can thus be used to align the permutations of the estimates across all the PARAFAC decompositions. In addition, the issue of identifiability is addressed, and simulations with synthetic speech signals are provided to verify the efficacy of the proposed method.
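
The permutation-alignment idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact tensor model, so the slice structure T[:, :, k] = A_f diag(d_k) A_{f+1}^H used below is an assumption, and the names cross_frequency_tensor, match_columns, A1, A2 and D are hypothetical. The sketch builds one synthetic trilinear tensor and then shows how a factor shared by two neighboring decompositions could be used to undo the column permutation and scaling ambiguity of a PARAFAC estimate.

```python
import itertools
import numpy as np


def cross_frequency_tensor(A_f, A_next, D):
    """Build T[:, :, k] = A_f @ diag(D[k]) @ A_next^H (assumed trilinear model).

    A_f, A_next: (sensors, sources) frequency responses at two neighboring bins.
    D: (blocks, sources) per-block cross-frequency source covariances.
    """
    return np.einsum("ir,kr,jr->ijk", A_f, D, A_next.conj())


def match_columns(A_ref, A_est):
    """Return perm such that A_est[:, perm] matches A_ref column by column,
    up to an arbitrary complex scaling of each column."""
    ref = A_ref / np.linalg.norm(A_ref, axis=0)
    est = A_est / np.linalg.norm(A_est, axis=0)
    sim = np.abs(ref.conj().T @ est)  # sim[i, j] = |<ref_i, est_j>|
    n = sim.shape[0]
    # Exhaustive search is fine for a handful of sources.
    best = max(
        itertools.permutations(range(n)),
        key=lambda p: sum(sim[i, p[i]] for i in range(n)),
    )
    return np.array(best)


rng = np.random.default_rng(0)
n_sensors, n_sources, n_blocks = 4, 3, 20

# Synthetic frequency responses at bins f and f+1 (random, for illustration only).
A1 = rng.standard_normal((n_sensors, n_sources)) + 1j * rng.standard_normal((n_sensors, n_sources))
A2 = rng.standard_normal((n_sensors, n_sources)) + 1j * rng.standard_normal((n_sensors, n_sources))
D = rng.uniform(0.5, 2.0, size=(n_blocks, n_sources))

T = cross_frequency_tensor(A1, A2, D)

# Simulate what the next decomposition would return for the shared factor A2:
# the same columns, but permuted and rescaled by unknown complex scalars.
true_perm = np.array([2, 0, 1])
scales = rng.uniform(0.5, 2.0, n_sources) * np.exp(1j * rng.uniform(0, 2 * np.pi, n_sources))
A2_est = A2[:, true_perm] * scales

perm = match_columns(A2, A2_est)
A2_aligned = A2_est[:, perm]  # columns now in the same source order as A2
```

The same permutation would then be applied to the other factor estimated in that decomposition, which is how a shared factor can propagate a consistent source ordering from one frequency bin to the next. The exhaustive search is chosen only to keep the sketch dependency-free; an assignment solver would scale better with the number of sources.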


Author information

Authors and Affiliations

School of Information and Communication Engineering, Dalian University of Technology, Dalian, 116024, China
Xiao-Feng Gong & Qiu-Hua Lin

Editor information

Editors and Affiliations

Dept. of Electrical Engineering, Université d’Evry Val d’Essonne, 40 rue du Pelvoux, 91020, Courcouronnes, France
Vincent Vigneron
Laboratoire I3S, Les Algorithmes - Euclide-B, BP 121, Université de Nice-Sophia Antipolis, 2000 Route des Lucioles, 06903, Sophia Antipolis Cedex, France
Vicente Zarzoso
School of Engineering, Dept. of Telecommunications, ISITV, Université de Toulon, Avenue George Pompidou, BP 56, La Valette du Var, Cedex, 83162, France
Eric Moreau
INRIA France, Equipe-projet METISS, Centre de Recherche INRIA Rennes-Bretagne Atlantique, Campus de Beaulieu, 35042, Rennes Cedex, France
Rémi Gribonval
INRIA France, Equipe-projet METISS, Centre de Recherche INRIA Rennes-Bretagne Atlantique, Campus de Beaulieu, 35042, Rennes Cedex, France
Emmanuel Vincent

About this paper

Cite this paper

Gong, XF., Lin, QH. (2010). Speech Separation via Parallel Factor Analysis of Cross-Frequency Covariance Tensor. In: Vigneron, V., Zarzoso, V., Moreau, E., Gribonval, R., Vincent, E. (eds) Latent Variable Analysis and Signal Separation. LVA/ICA 2010. Lecture Notes in Computer Science, vol 6365. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15995-4_9

DOI: https://doi.org/10.1007/978-3-642-15995-4_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15994-7
Online ISBN: 978-3-642-15995-4
eBook Packages: Computer Science, Computer Science (R0)
