Abstract
The music industry has come a long way since its inception, and music producers have embraced modern technology to bring their creations to life. Systems that separate sounds by source, especially vocals from songs, have long been needed and have attracted the attention of researchers as well. Vocal separation becomes even more challenging in a multi-instrument environment. Before attempting source separation, a system must first be able to detect whether a piece of music contains vocals. In this paper, such a system is proposed and tested on a database of more than 99 h of instrumentals and songs. Using line spectral frequency-based features, we obtained a highest accuracy of 99.78% among six different classifiers, viz. BayesNet, Support Vector Machine, Multi Layer Perceptron, LibLinear, Simple Logistic and Decision Table.
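The abstract names line spectral frequency (LSF) features but does not describe how they are computed. As a rough illustration only, and not the authors' pipeline, the sketch below derives LSFs for a single audio frame from linear prediction coefficients using NumPy. The function names `lpc` and `lsf_from_lpc`, the frame length of 512 samples, the prediction order of 10, the Hamming window and the small regularization term are all assumptions chosen for the example.

```python
import numpy as np


def lpc(frame, order):
    """Autocorrelation-method LPC. Returns [1, a1, ..., ap] (hypothetical helper)."""
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation at lags 0..order (zero lag sits at index n - 1).
    r = np.correlate(windowed, windowed, mode="full")[n - 1 : n + order]
    # Toeplitz normal equations R a = -r[1:], built explicitly.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # assumed tiny regularization for stability
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def lsf_from_lpc(a):
    """Convert LPC coefficients [1, a1, ..., ap] to p LSFs in radians, ascending."""
    a_ext = np.concatenate((a, [0.0]))
    p_poly = a_ext + a_ext[::-1]  # symmetric polynomial P(z)
    q_poly = a_ext - a_ext[::-1]  # antisymmetric polynomial Q(z)
    roots = np.concatenate((np.roots(p_poly), np.roots(q_poly)))
    angles = np.angle(roots)
    eps = 1e-6
    # Keep one angle per conjugate pair; drop the trivial roots at z = 1 and z = -1.
    return np.sort(angles[(angles > eps) & (angles < np.pi - eps)])


# Usage on a synthetic frame (seeded noise stands in for real audio).
rng = np.random.default_rng(0)
frame = rng.standard_normal(512)
frame_lsf = lsf_from_lpc(lpc(frame, order=10))
print(frame_lsf)
```

In a full system, per-frame vectors such as `frame_lsf` would typically be aggregated over a clip before being passed to a classifier; how the paper does this is not stated in the abstract.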
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Mukherjee, H., Obaidullah, S.M., Santosh, K.C., Gonçalves, T., Phadikar, S., Roy, K. (2020). Instrumentals/Songs Separation for Background Music Removal. In: Kulczycki, P., Kacprzyk, J., Kóczy, L., Mesiar, R., Wisniewski, R. (eds) Information Technology, Systems Research, and Computational Physics. ITSRCP 2018. Advances in Intelligent Systems and Computing, vol 945. Springer, Cham. https://doi.org/10.1007/978-3-030-18058-4_19
DOI: https://doi.org/10.1007/978-3-030-18058-4_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-18057-7
Online ISBN: 978-3-030-18058-4
eBook Packages: Intelligent Technologies and Robotics (R0)