Speaker recognition: an enhanced approach to identify singer voice using neural network

Biswas, Sharmila; Solanki, Sandeep Singh

doi:10.1007/s10772-020-09698-8

Published: 16 March 2020

Volume 24, pages 9–21, (2021)

International Journal of Speech Technology

Abstract

Communicating with other people is among the most common human activities, and listeners can usually recognize the voice of someone they know. The same holds for voices in music: if the artist's voice is familiar, recognition is easy, but if it is unfamiliar to the listener, identifying the voice within the music is difficult. Singer recognition is therefore a demanding research area that calls for suitable algorithms from audio signal processing. Several approaches are possible, such as isolating the voice frequency range from the audio signal or detecting the peaks of the voice within the music. Because music is polyphonic, the frequency components must be analyzed so that the peaks of the voice signal can be detected, which can make such detection easier. In this paper, a set of songs is used to create the training data on which a neural network is trained, and a separate set of data is prepared for testing. In addition to supervised learning, hyperparameter tuning is applied, and the efficiency of detecting new and unknown singers is observed. The neural network achieves about 99.29% accuracy, so detection is performed at a satisfactory level.
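The abstract mentions two ideas: restricting analysis to the voice frequency range and detecting spectral peaks. As a rough illustration only, the sketch below picks the strongest spectral peaks inside a frequency band using NumPy. It is not the authors' method. The function name `vocal_band_peaks`, the band limits of 80 to 1100 Hz, and the synthetic test signal are all assumptions introduced here for illustration.

```python
import numpy as np


def vocal_band_peaks(signal, sample_rate, band=(80.0, 1100.0), n_peaks=3):
    """Return the frequencies (Hz) of the strongest spectral peaks inside `band`.

    The default band limits are illustrative assumptions, not values
    taken from the paper.
    """
    # Window the frame to reduce spectral leakage, then take the magnitude spectrum.
    windowed = signal * np.hanning(len(signal))
    magnitude = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)

    # A bin is a local peak if it is larger than both of its neighbours.
    is_peak = np.zeros(len(magnitude), dtype=bool)
    is_peak[1:-1] = (magnitude[1:-1] > magnitude[:-2]) & (magnitude[1:-1] > magnitude[2:])

    # Keep only peaks that fall inside the assumed vocal band.
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    candidates = np.flatnonzero(is_peak & in_band)

    # Select the n_peaks candidates with the largest magnitude.
    order = np.argsort(magnitude[candidates])[::-1][:n_peaks]
    return np.sort(freqs[candidates[order]])


# Synthetic one-second frame: two in-band tones plus one out-of-band tone.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
frame = (np.sin(2 * np.pi * 220 * t)
         + 0.5 * np.sin(2 * np.pi * 440 * t)
         + 0.8 * np.sin(2 * np.pi * 3000 * t))
peaks = vocal_band_peaks(frame, sample_rate, n_peaks=2)
```

On this synthetic frame the 3000 Hz tone lies outside the assumed band and is ignored, so only the two in-band tones remain as candidates. Peak frequencies of this kind could serve as input features for a classifier, though the features actually used in the paper are not stated in the abstract.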

Author information

Authors and Affiliations

ECE Department, BIT-Mesra, Ranchi, Jharkhand, India
Sharmila Biswas & Sandeep Singh Solanki

Corresponding author

Correspondence to Sharmila Biswas.

About this article

Cite this article

Biswas, S., Solanki, S.S. Speaker recognition: an enhanced approach to identify singer voice using neural network. Int J Speech Technol 24, 9–21 (2021). https://doi.org/10.1007/s10772-020-09698-8

Received: 02 January 2020
Accepted: 06 March 2020
Published: 16 March 2020
Issue Date: March 2021
DOI: https://doi.org/10.1007/s10772-020-09698-8
