Codebook Approaches for Single Sensor Speech/Music Separation

  • Chapter
Speech Processing in Modern Communication

Part of the book series: Springer Topics in Signal Processing ((STSP,volume 3))

Abstract

This chapter is an introduction to single sensor source separation for speech/music audio mixtures. The approaches reviewed in this study are all based on a full (Bayesian) probabilistic framework for both source modeling and source estimation. We first review several codebook approaches for single sensor source separation, as well as several attempts to enhance the algorithms. All these approaches aim at adaptively estimating the optimal time-frequency masks for each audio component within the mixture. Three strategies for source modeling are presented: Gaussian scaled mixture models, codebooks of autoregressive models, and Bayesian non-negative matrix factorization (BNMF). These models are described in detail, and two estimators for the time-frequency masks are presented, namely the minimum mean-squared error and the maximum a posteriori estimators. We then propose two extensions of and improvements to the BNMF method. The first enhances discrimination between speech and music through multi-scale analysis. The second constrains the estimation of the expansion coefficients with prior information. Finally, we demonstrate the improved performance of the proposed methods on mixtures of voice and music signals, before presenting conclusions and perspectives.
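As a rough illustration of what a time-frequency mask is, the sketch below computes a Wiener-style soft mask from per-bin source variances. This is a generic textbook construction under the assumption of independent zero-mean Gaussian sources, not the chapter's own algorithm; the function name `wiener_masks` and the toy variance values are hypothetical. In the codebook methods described above, the variances would be derived from trained source models, whereas here they are simply given.

```python
def wiener_masks(speech_psd, music_psd, eps=1e-12):
    """Return (speech_mask, music_mask), one gain in [0, 1] per time-frequency bin.

    speech_psd and music_psd are lists of frames, each frame a list of
    non-negative variances per frequency bin (assumed known here).
    """
    speech_mask = []
    for s_row, m_row in zip(speech_psd, music_psd):
        # Ratio of the speech variance to the total variance in each bin;
        # eps guards against division by zero when both variances vanish.
        speech_mask.append([s / (s + m + eps) for s, m in zip(s_row, m_row)])
    # The two masks are complementary, so they sum to one in every bin.
    music_mask = [[1.0 - g for g in row] for row in speech_mask]
    return speech_mask, music_mask


# Toy variances (hypothetical): 2 frames x 3 frequency bins.
speech_psd = [[4.0, 1.0, 0.0], [2.0, 2.0, 1.0]]
music_psd = [[1.0, 1.0, 3.0], [2.0, 6.0, 0.0]]
speech_mask, music_mask = wiener_masks(speech_psd, music_psd)
```

Multiplying the mixture's short-time spectrum by `speech_mask` or `music_mask`, bin by bin, would give an estimate of the corresponding source spectrum under these assumptions.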

Copyright information

© 2010 Springer Berlin Heidelberg

About this chapter

Cite this chapter

Blouet, R., Cohen, I. (2010). Codebook Approaches for Single Sensor Speech/Music Separation. In: Cohen, I., Benesty, J., Gannot, S. (eds) Speech Processing in Modern Communication. Springer Topics in Signal Processing, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11130-3_7

  • DOI: https://doi.org/10.1007/978-3-642-11130-3_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11129-7

  • Online ISBN: 978-3-642-11130-3

  • eBook Packages: Engineering, Engineering (R0)
