Codebook Approaches for Single Sensor Speech/Music Separation

Blouet, Raphaël; Cohen, Israel

doi:10.1007/978-3-642-11130-3_7

Part of the book series: Springer Topics in Signal Processing (STSP, volume 3)

Abstract

The work presented in this chapter is an introduction to single sensor source separation, dedicated to the case of speech/music audio mixtures. The approaches reviewed in this study are all based on a full (Bayesian) probabilistic framework for both source modeling and source estimation. We first present a review of several codebook approaches for single sensor source separation as well as several attempts to enhance the algorithms. All these approaches aim at adaptively estimating the optimal time-frequency masks for each audio component within the mixture. Three strategies for source modeling are presented: Gaussian scaled mixture models, codebooks of autoregressive models, and Bayesian non-negative matrix factorization (BNMF). These models are described in detail, and two estimators for the time-frequency masks are presented, namely the minimum mean-squared error and the maximum a posteriori estimators. We then propose two extensions and improvements of the BNMF method. The first enhances discrimination between speech and music through multi-scale analysis. The second constrains the estimation of the expansion coefficients with prior information. We finally demonstrate the improved performance of the proposed methods on mixtures of voice and music signals, before giving conclusions and perspectives.
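As a rough illustration of what estimating time-frequency masks from source codebooks can look like, the sketch below uses plain (non-Bayesian) NMF with fixed spectral dictionaries and a Wiener-like soft mask. It is not the chapter's BNMF method or either of its estimators. The function names, the Euclidean multiplicative update, the random toy dictionaries, and all parameter values are assumptions made for the example.

```python
import numpy as np


def estimate_activations(V, W, n_iter=200, eps=1e-12, seed=0):
    """Fit V ~= W @ H with W held fixed (assumed pre-trained codebook).

    Uses the standard multiplicative update for the Euclidean cost,
    which keeps H non-negative at every iteration.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H


def soft_masks(V, W_speech, W_music, n_iter=200, eps=1e-12):
    """Return Wiener-like soft masks for speech and music.

    V is a non-negative mixture spectrogram (frequency x time);
    W_speech and W_music are hypothetical per-source dictionaries.
    """
    W = np.hstack([W_speech, W_music])
    H = estimate_activations(V, W, n_iter=n_iter, eps=eps)
    k = W_speech.shape[1]
    V_speech = W_speech @ H[:k]   # speech part of the model
    V_music = W_music @ H[k:]     # music part of the model
    total = V_speech + V_music + eps
    return V_speech / total, V_music / total


# Toy data: random dictionaries stand in for trained codebooks.
rng = np.random.default_rng(1)
W_speech = rng.random((16, 3))
W_music = rng.random((16, 4))
V = W_speech @ rng.random((3, 10)) + W_music @ rng.random((4, 10))

mask_speech, mask_music = soft_masks(V, W_speech, W_music)
speech_estimate = mask_speech * V   # masked mixture spectrogram
music_estimate = mask_music * V
```

In this sketch the two masks sum to one in every time-frequency bin, so the two masked spectrograms add back up to the mixture. A real system would apply the masks to the complex short-time Fourier transform of the mixture and invert it to obtain time-domain signals.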

Author information

Authors and Affiliations

Audionamix, France
Raphaël Blouet
Technion-Israel Institute of Technology, Israel
Israel Cohen

Editor information

Editors and Affiliations

Dept. Electrical Engineering, Technion - Israel Institute of Technology, 32000, Haifa Technion City, Israel
Israel Cohen
Inst. National de la Recherche Scientifique (INRS), Université de Quebec, 800 de la Gauchetiere Ouest, H5A 1K6, Montreal, QC, Canada
Jacob Benesty
School of Engineering, Bar-Ilan University, 52900 Ramat-Gan, Bdg. 1103, Israel
Sharon Gannot

About this chapter

Cite this chapter

Blouet, R., Cohen, I. (2010). Codebook Approaches for Single Sensor Speech/Music Separation. In: Cohen, I., Benesty, J., Gannot, S. (eds) Speech Processing in Modern Communication. Springer Topics in Signal Processing, vol 3. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11130-3_7

DOI: https://doi.org/10.1007/978-3-642-11130-3_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11129-7
Online ISBN: 978-3-642-11130-3
eBook Packages: Engineering, Engineering (R0)
