Abstract
Audio interactive applications, ranging from speech recognition to song identification, have simplified everyday tasks in numerous ways. They have made Information Technology more accessible to ordinary users by letting them bypass complicated interaction procedures. Audio-based search applications have become very popular, especially for searching songs. A system that can distinguish speech from songs can boost the performance of such applications by reducing the search space and by selecting the recognition method according to the type of audio. It can also aid music-speech separation for karaoke development. In this paper, a system that segregates songs and speech using Line Spectral Pair based features is proposed. The system was tested on a database of 19374 clips, and a highest accuracy of 99.88% was obtained with Ensemble Learning based classification.
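As a rough illustration of what a Line Spectral Pair based feature looks like, the sketch below derives line spectral frequencies from one audio frame using NumPy. It is not the authors' pipeline: the function names, the LPC order of 10, the 25 ms frame at 16 kHz, the Hamming window, and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate LPC coefficients [1, a1, ..., ap] with the autocorrelation
    method and the Levinson-Durbin recursion."""
    n = len(frame)
    # Autocorrelation at lags 0..order.
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err
        prev = a.copy()
        a[1:i] = prev[1:i] + k * prev[i - 1 : 0 : -1]
        a[i] = k
        err *= 1.0 - k * k
    return a


def line_spectral_frequencies(a):
    """Convert LPC coefficients to line spectral frequencies in radians."""
    a_ext = np.concatenate([a, [0.0]])
    # Symmetric (P) and antisymmetric (Q) polynomials built from A(z).
    p_poly = a_ext + a_ext[::-1]
    q_poly = a_ext - a_ext[::-1]
    roots = np.concatenate([np.roots(p_poly), np.roots(q_poly)])
    angles = np.angle(roots)
    # Keep one angle per conjugate pair; drop the trivial roots at z = 1 and z = -1.
    keep = (angles > 1e-6) & (angles < np.pi - 1e-6)
    return np.sort(angles[keep])


# Hypothetical input: a 25 ms synthetic frame (tone plus noise) at 16 kHz.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(400) / sample_rate
frame = np.sin(2 * np.pi * 440 * t) + 0.2 * rng.standard_normal(t.size)
frame = frame * np.hamming(t.size)

features = line_spectral_frequencies(lpc_coefficients(frame, order=10))
print(features.shape)
```

A per-clip feature vector could then be built by aggregating such frame-level values before passing them to an ensemble classifier; how the paper aggregates them is not stated in the abstract.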
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Mukherjee, H., Phadikar, S., Roy, K. (2018). Segregation of Speech and Songs - A Precursor to Audio Interactive Applications. In: Mandal, J., Sinha, D. (eds) Social Transformation – Digital Way. CSI 2018. Communications in Computer and Information Science, vol 836. Springer, Singapore. https://doi.org/10.1007/978-981-13-1343-1_5
DOI: https://doi.org/10.1007/978-981-13-1343-1_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1342-4
Online ISBN: 978-981-13-1343-1
eBook Packages: Computer Science (R0)