
REPET for Background/Foreground Separation in Audio

Chapter in: Blind Source Separation

Part of the book series: Signals and Communication Technology (SCT)

Abstract

Repetition is a fundamental element in generating and perceiving structure. In audio, mixtures are often composed of structures where a repeating background signal is overlaid with a varying foreground signal (e.g., a singer overlaying varying vocals on a repeating accompaniment, or a varying speech signal mixed with a repeating background noise). On this basis, we present the REpeating Pattern Extraction Technique (REPET), a simple approach for separating the repeating background from the non-repeating foreground in an audio mixture. The basic idea is to identify the repeating elements in the mixture, derive the underlying repeating models, and extract the repeating background by comparing the models to the mixture. Unlike other separation approaches, REPET does not depend on special parameterizations, does not rely on complex frameworks, and does not require external information. Because it is based only on repetition, it has the advantage of being simple, fast, blind, and therefore completely and easily automatable.
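
To make these three steps concrete, here is a minimal sketch of the core idea in Python with NumPy and SciPy. It is an illustration under simplifying assumptions, not the authors' reference implementation: the function name repet, the STFT parameters, and the crude row-wise autocorrelation used to find the repeating period are placeholders for the published method, which estimates the period from a beat spectrum.

    import numpy as np
    from scipy.signal import stft, istft

    def repet(x, sr, n_fft=2048, hop=512):
        """Split a mono signal x into (background, foreground)."""
        # Magnitude spectrogram of the mixture.
        _, _, X = stft(x, fs=sr, nperseg=n_fft, noverlap=n_fft - hop)
        V = np.abs(X)

        # Step 1 -- find the repeating period: average the autocorrelation
        # of each frequency row of the power spectrogram (a crude stand-in
        # for the beat spectrum) and pick the strongest non-zero lag,
        # searching only lags short enough for at least three repetitions.
        P = V ** 2
        acf = np.sum([np.correlate(row, row, mode='full') for row in P], axis=0)
        acf = acf[acf.size // 2:]                          # keep lags >= 0
        period = int(np.argmax(acf[1:acf.size // 3])) + 1  # period in frames

        # Step 2 -- derive the repeating model: element-wise median across
        # the period-length segments of the spectrogram.
        n = V.shape[1] // period
        segments = V[:, :n * period].reshape(V.shape[0], n, period)
        model = np.median(segments, axis=1)

        # Step 3 -- compare model and mixture: the repeating background
        # cannot exceed the mixture at any time-frequency bin, so clip the
        # tiled model by the mixture and form a soft mask.
        W = np.tile(model, n + 1)[:, :V.shape[1]]
        mask = np.minimum(W, V) / (V + 1e-12)

        # Mask the complex STFT and invert to recover the background;
        # the foreground is the residual.
        _, background = istft(mask * X, fs=sr, nperseg=n_fft, noverlap=n_fft - hop)
        m = min(len(x), len(background))
        return background[:m], x[:m] - background[:m]

On a mono pop excerpt, background, foreground = repet(x, sr) should roughly recover the repeating accompaniment and the varying vocals; taking the median across segments (rather than the mean) is what makes the model robust to the non-repeating foreground.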


Notes

  1. http://music.eecs.northwestern.edu/research.php?project=repet

  2. http://sisec.wiki.irisa.fr/tiki-index.php?page=Professionally+produced+music+recordings

  3. http://sisec.wiki.irisa.fr/tiki-index.php?page=Two-channel+mixtures+of+speech+and+real-world+background+noise


Author information

Correspondence to Zafar Rafii.


Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Rafii, Z., Liutkus, A., Pardo, B. (2014). REPET for Background/Foreground Separation in Audio. In: Naik, G., Wang, W. (eds) Blind Source Separation. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-55016-4_14

Download citation

  • DOI: https://doi.org/10.1007/978-3-642-55016-4_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-55015-7

  • Online ISBN: 978-3-642-55016-4

  • eBook Packages: Engineering, Engineering (R0)
