Abstract
This paper reports the organization and results of the 2018 community-based Signal Separation Evaluation Campaign (SiSEC 2018). This year's edition focused on audio and continued the effort to scale up evaluation and to make it easier to prototype audio separation software in an era of machine-learning-based systems. For this purpose, we prepared a new music separation database, MUSDB18, featuring close to 10 hours of audio. Additionally, open-source software was released to automatically load, process, and report performance on MUSDB18. Furthermore, a new official Python version of the BSS Eval toolbox was released, along with reference implementations for three oracle separation methods: ideal binary mask, ideal ratio mask, and multichannel Wiener filter. Finally, we report the results obtained by the participants.
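Two of the oracle methods mentioned above, the ideal binary mask and the ideal ratio mask, have simple textbook definitions: given the (unobservable) magnitude spectrograms of the true sources, the binary mask assigns each time-frequency bin entirely to the loudest source, while the ratio mask splits each bin in proportion to source energy. The sketch below illustrates these definitions with NumPy; the function names and array layout are our own for illustration, not the campaign's released reference implementations.

```python
import numpy as np

def ideal_binary_mask(mags):
    """Ideal binary mask from source magnitude spectrograms.

    mags: array of shape (n_sources, n_freq, n_frames), the true
    source magnitudes. Each time-frequency bin is assigned fully
    to the source with the largest magnitude there.
    """
    winner = np.argmax(mags, axis=0)  # index of dominant source per bin
    masks = np.stack([(winner == i) for i in range(mags.shape[0])])
    return masks.astype(float)

def ideal_ratio_mask(mags, alpha=2.0, eps=1e-12):
    """Ideal ratio mask (soft mask) from source magnitude spectrograms.

    With alpha=2 this distributes each bin in proportion to source
    power, which is the single-channel Wiener-filter gain.
    """
    powered = mags ** alpha
    return powered / (powered.sum(axis=0, keepdims=True) + eps)
```

Applying either mask to the mixture spectrogram and inverting the STFT yields an upper-bound "oracle" separation, which is how such masks serve as performance ceilings in evaluation campaigns.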
Notes
- pip install museval.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
Cite this paper
Stöter, F.-R., Liutkus, A., Ito, N. (2018). The 2018 Signal Separation Evaluation Campaign. In: Deville, Y., Gannot, S., Mason, R., Plumbley, M., Ward, D. (eds.) Latent Variable Analysis and Signal Separation. LVA/ICA 2018. Lecture Notes in Computer Science, vol. 10891. Springer, Cham. https://doi.org/10.1007/978-3-319-93764-9_28
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-93763-2
Online ISBN: 978-3-319-93764-9