The Journal of Supercomputing, Volume 75, Issue 3, pp 1594–1609

Real-time Soundprism

  • A. J. Muñoz-Montoro
  • J. Ranilla
  • P. Vera-Candeas
  • E. F. Combarro
  • P. Alonso-Jordá


This paper presents a parallel real-time sound source separation system that decomposes an audio signal captured with a single microphone into as many audio signals as there are instruments actually playing. This approach is usually known as Soundprism. The application scenario is a concert hall in which listeners, instead of hearing the mixed audio, want to receive the audio of a single instrument in order to focus on a particular performance. The challenge is greater because the system must run in real time on handheld devices, i.e., devices characterized by both low power consumption and mobility. The results show that real-time operation is achievable in the tested scenarios using an ARM processor, aided by a GPU when one is available.


Sound source separation, Real-time, Score alignment, Audio processing, Parallel computing, GPGPU



This work has been supported by the “Ministerio de Economía y Competitividad” of Spain and FEDER under projects TEC2015-67387-C4-{1,2,3}-R.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Telecommunication Engineering, Universidad de Jaén, Jaén, Spain
  2. Department of Computer Science, Universidad de Oviedo, Oviedo, Spain
  3. Department of Information Systems and Computation, Universitat Politècnica de València, Valencia, Spain
