Bayesian Audio Source Separation

Févotte, Cédric

doi:10.1007/978-1-4020-6479-1_11

Part of the book series: Signals and Communication Technology (SCT)

In this chapter we describe a Bayesian approach to audio source separation. The approach relies on probabilistic modeling of sound sources as (sparse) linear combinations of atoms from a dictionary, with inference by Markov chain Monte Carlo (MCMC). Several prior distributions are considered for the source expansion coefficients. We first consider independent and identically distributed (iid) general priors with two choices of distribution. The first is the Student t, which is a good model for sparsity when the shape parameter has a low value. The second is a hierarchical mixture distribution: conditionally upon an indicator variable, each coefficient is either set to zero or given a normal distribution, whose variance is in turn given an inverted-Gamma distribution. We then consider more audio-specific models in which both the identically distributed and the independence assumptions are lifted. Using a Modified Discrete Cosine Transform (MDCT) dictionary, a time–frequency orthonormal basis, we describe frequency-dependent structured priors that explicitly model the harmonic structure of sound through hierarchical Markov modeling of the expansion coefficients. Separation results are given for a stereophonic recording of three sources.
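
The two iid priors mentioned above can be illustrated with a minimal sampling sketch. It is not taken from the chapter: the function names, the parameter names (`nu`, `scale`, `p_active`, `a`, `b`) and all numeric values are assumptions chosen for illustration, and the chapter's own parameterization may differ. The Student t is drawn through its standard representation as a scale mixture of normals (a normal whose variance follows an inverted-Gamma distribution), and the hierarchical mixture is drawn as an indicator variable that either zeroes a coefficient or gives it a normal distribution with inverted-Gamma variance.

```python
import numpy as np

def sample_student_t(n, nu, scale, rng):
    """Draw n Student t coefficients via a scale mixture of normals.

    Assumed parameterization: v ~ InvGamma(nu/2, nu*scale^2/2), s | v ~ N(0, v).
    """
    # If X ~ Gamma(shape=a, rate=b) then 1/X ~ InvGamma(a, b); NumPy uses scale = 1/rate.
    v = 1.0 / rng.gamma(shape=nu / 2.0, scale=2.0 / (nu * scale**2), size=n)
    return np.sqrt(v) * rng.standard_normal(n)

def sample_indicator_mixture(n, p_active, a, b, rng):
    """Draw n coefficients that are zero or normal, depending on an indicator.

    Assumed parameterization: indicator ~ Bernoulli(p_active),
    v ~ InvGamma(a, b), s | indicator=1, v ~ N(0, v), s | indicator=0 is 0.
    """
    active = rng.random(n) < p_active
    v = 1.0 / rng.gamma(shape=a, scale=1.0 / b, size=n)
    s = np.where(active, np.sqrt(v) * rng.standard_normal(n), 0.0)
    return s, active

rng = np.random.default_rng(0)
# Low shape parameter (nu = 1) gives heavy tails: most draws are small, a few are large.
t_coeffs = sample_student_t(10_000, nu=1.0, scale=0.1, rng=rng)
# Roughly 10% of coefficients are non-zero under the indicator-based prior.
mix_coeffs, active = sample_indicator_mixture(10_000, p_active=0.1, a=2.0, b=1.0, rng=rng)
```

With a low shape parameter the Student t draws concentrate near zero with occasional large values, whereas the indicator-based prior produces exact zeros, which is the practical difference between the two notions of sparsity in this sketch.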

Author information

Authors and Affiliations

GET/Télécom Paris (ENST), 37–39, rue Dareau, 75014, Paris, France
Cédric Févotte

Editor information

Editors and Affiliations

NTT Corporation, 2-4 Hikaridai, 619-0237, Soraku-gun, Kyoto, Japan
Shoji Makino & Hiroshi Sawada
University of California, San Diego, 9500 Gilman Drive, 0523, 92093-0523, La Jolla, CA, USA
Te-Won Lee

About this chapter

Cite this chapter

Févotte, C. (2007). Bayesian Audio Source Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_11

DOI: https://doi.org/10.1007/978-1-4020-6479-1_11
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-6478-4
Online ISBN: 978-1-4020-6479-1
eBook Packages: Engineering, Engineering (R0)
