
Spatial Manipulation of Musical Sound: Informed Source Separation and Respatialization

Computational Phonogram Archiving

Part of the book series: Current Research in Systematic Musicology (CRSM, volume 5)

Abstract

“Active listening” enables listeners to interact with the sound while it is played, much as composers of electroacoustic music do. The main manipulation of the musical scene is (re)spatialization: moving sound sources in space. This is equivalent to source separation: moving all the sources of the scene but one away from the listener isolates that source and, conversely, once the sources are separated, moving them and rendering the corresponding scene (spatial image) is easy. Allowing this spatial interaction/source separation from fixed musical pieces with sufficient quality is (too) challenging a task for classic approaches, since it requires an analysis of the scene with inevitable (and often unacceptable) estimation errors. We therefore introduced the informed approach, which consists in inaudibly embedding some additional information. This information, coded at a minimal rate, aims at increasing the precision of the analysis/separation. The informed approach thus relies on both estimation theory and information theory. During the DReaM project, several informed source separation (ISS) methods were proposed. Among the best is the one based on spatial filtering (beamforming), with the perceptually coded spectral envelopes of the sources as additional information. More precisely, the proposed method is realized in an encoder-decoder framework. At the encoder, the spectral envelopes of the (known) original sources are extracted, their frequency resolution is adapted to the critical bands, and their magnitudes are logarithmically quantized. These envelopes are then passed on to the decoder along with the stereo mixture. At the decoder, the mixture signal is decomposed by time-frequency selective spatial filtering guided by a source activity index derived from the spectral envelope values. Real-time manipulation of the sound sources then becomes possible from musical pieces that were initially fixed (possibly on media such as CDs), with unprecedented (and controllable) quality.
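The encoder steps named in the abstract (reducing a source spectrum to critical-band resolution, then quantizing it logarithmically) can be illustrated with a minimal sketch. It is not the chapter's implementation: the Bark-scale approximation, the one-band-per-integer-Bark layout, the use of power in dB, the 1.5 dB step, the -90 dB floor, and all function names are assumptions chosen for illustration only.

```python
import math


def bark(f_hz):
    """Approximate Bark (critical-band) value of a frequency in Hz.

    Assumption: a common closed-form approximation; the band layout
    actually used in the chapter is not given in the abstract.
    """
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def band_envelope(power_spectrum, sample_rate):
    """Average the power of one frame over critical bands.

    power_spectrum: non-negative values for bins 0..Nyquist (inclusive).
    Returns one mean power value per integer Bark band, low to high.
    """
    n_bins = len(power_spectrum)
    sums, counts = {}, {}
    for k, p in enumerate(power_spectrum):
        f = k * (sample_rate / 2.0) / (n_bins - 1)  # bin centre frequency
        b = int(bark(f))                            # hypothetical band index
        sums[b] = sums.get(b, 0.0) + p
        counts[b] = counts.get(b, 0) + 1
    return [sums[b] / counts[b] for b in sorted(sums)]


def quantize_db(envelope, step_db=1.5, floor_db=-90.0):
    """Logarithmic quantization: uniform steps on a dB scale."""
    indices = []
    for p in envelope:
        level = 10.0 * math.log10(p) if p > 0 else floor_db
        indices.append(round(max(level, floor_db) / step_db))
    return indices


def dequantize_db(indices, step_db=1.5):
    """Decoder side: map quantization indices back to linear power."""
    return [10.0 ** (i * step_db / 10.0) for i in indices]


if __name__ == "__main__":
    frame = [1.0] * 513  # flat power spectrum, 1024-point FFT assumed
    env = band_envelope(frame, 44100)
    print(len(env), quantize_db(env)[:3])
```

With uniform steps on a dB scale, the reconstruction error of each band stays within half a step (here 0.75 dB) regardless of the band's level, which is why a logarithmic quantizer is a natural fit for loudness-like quantities. The side information per frame is then a small set of integers per source rather than a full spectrum.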


Notes

  1. See URL: http://dream.labri.fr.

  2. See URL: http://www.rockband.com.


Acknowledgements

This research was partly supported by the French ANR (Agence Nationale de la Recherche), within the scope of the DReaM project (ANR-09-CORD-006). “You may say I’m a dreamer, but I’m not the only one.” (John Lennon, Imagine). Thus, the author would like to thank all the members of the project consortium for having made the DReaM come true.

Author information

Corresponding author

Correspondence to Sylvain Marchand.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Marchand, S. (2019). Spatial Manipulation of Musical Sound: Informed Source Separation and Respatialization. In: Bader, R. (ed.) Computational Phonogram Archiving. Current Research in Systematic Musicology, vol 5. Springer, Cham. https://doi.org/10.1007/978-3-030-02695-0_8

  • DOI: https://doi.org/10.1007/978-3-030-02695-0_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-02694-3

  • Online ISBN: 978-3-030-02695-0

  • eBook Packages: Engineering, Engineering (R0)
