Bayesian Audio Source Separation

Blind Speech Separation

Part of the book series: Signals and Communication Technology (SCT)

In this chapter we describe a Bayesian approach to audio source separation. The approach relies on probabilistic modeling of sound sources as (sparse) linear combinations of atoms from a dictionary, and on Markov chain Monte Carlo (MCMC) inference. Several prior distributions are considered for the source expansion coefficients. We first consider general independent and identically distributed (iid) priors, with two choices of distribution. The first is the Student t, which is a good model for sparsity when its shape parameter takes a low value. The second is a hierarchical mixture distribution: conditionally upon an indicator variable, each coefficient is either set to zero or given a normal distribution, whose variance is in turn given an inverted-Gamma distribution. We then consider more audio-specific models in which both the identically distributed and the independence assumptions are lifted. Using a Modified Discrete Cosine Transform (MDCT) dictionary, which is a time–frequency orthonormal basis, we describe frequency-dependent structured priors that explicitly model the harmonic structure of sound through a Markov hierarchical model of the expansion coefficients. Separation results are given for a stereophonic recording of three sources.
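
The two iid priors named in the abstract can be illustrated by drawing coefficients from them. The following is only a minimal sketch, not the chapter's implementation: the function names, the parameter names (`dof`, `scale`, `p_active`, `ig_shape`, `ig_scale`) and the parameterization are assumptions made for illustration. It uses the standard result that a Student t variable is a normal variable whose variance follows an inverted-Gamma distribution, and draws the inverted-Gamma variance as a scale divided by a Gamma variate.

```python
import math
import random


def sample_student_t(rng, dof, scale=1.0):
    """Draw one Student t coefficient as a scale mixture of normals.

    Assumed parameterization: v ~ InvGamma(dof/2, dof*scale^2/2),
    then s | v ~ N(0, v). Small `dof` gives heavy tails (sparsity).
    """
    ig_shape = dof / 2.0
    ig_scale = dof * scale ** 2 / 2.0
    # If g ~ Gamma(shape, 1) then ig_scale / g ~ InvGamma(shape, ig_scale).
    v = ig_scale / rng.gammavariate(ig_shape, 1.0)
    return rng.gauss(0.0, math.sqrt(v))


def sample_mixture(rng, p_active, ig_shape, ig_scale):
    """Draw one coefficient from the hierarchical mixture prior.

    The indicator is active with probability `p_active` (hypothetical name).
    Inactive coefficients are exactly zero; active ones are normal with an
    inverted-Gamma distributed variance.
    """
    if rng.random() >= p_active:
        return 0.0, False
    v = ig_scale / rng.gammavariate(ig_shape, 1.0)
    return rng.gauss(0.0, math.sqrt(v)), True


if __name__ == "__main__":
    rng = random.Random(0)
    t_draws = [sample_student_t(rng, dof=2.0) for _ in range(5)]
    m_draws = [sample_mixture(rng, 0.1, 2.0, 1.0) for _ in range(5)]
    print(t_draws)
    print(m_draws)
```

In the mixture prior most coefficients are exactly zero, whereas the Student t prior produces many small values and occasional large ones; both encode sparsity, in different ways.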




Copyright information

© 2007 Springer

About this chapter

Cite this chapter

Févotte, C. (2007). Bayesian Audio Source Separation. In: Makino, S., Sawada, H., Lee, TW. (eds) Blind Speech Separation. Signals and Communication Technology. Springer, Dordrecht. https://doi.org/10.1007/978-1-4020-6479-1_11

  • DOI: https://doi.org/10.1007/978-1-4020-6479-1_11

  • Publisher Name: Springer, Dordrecht

  • Print ISBN: 978-1-4020-6478-4

  • Online ISBN: 978-1-4020-6479-1
