Abstract
In audio forensics, the general task of speaker identification is to identify the unknown speaker in an audio proof who is suspected of a crime. The voice of each person in a group of suspects is compared with the audio proof to determine which of them is its source. This paper proposes a semi-supervised speaker identification method that does not require a training stage. Feature extraction is based on cochleagrams of previously selected words. The system can identify one or more suspects who have high similarity to the audio proof, or give a null response if no suspect satisfies a similarity threshold. The results of the proposed method are compared with those of the same method using spectrograms instead of cochleagrams. Performance is measured through a confusion matrix (true and false positives, true and false negatives), and global results are given in terms of overall accuracy and kappa index. According to several tests, the system has an overall accuracy higher than 0.97 and a kappa index of around 0.78, which indicates high confidence in both identification and rejection results.
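As a rough illustration of the decision rule and the two global metrics named in the abstract, the following Python sketch may help. It is not the authors' implementation: the function names, the similarity scores, the threshold value, and the confusion-matrix counts are all hypothetical, and the abstract does not say how similarity between cochleagrams is computed. The kappa formula is the standard Cohen's kappa for a 2x2 confusion matrix.

```python
def identify(similarities, threshold):
    """Return every suspect whose similarity meets the threshold.

    An empty list plays the role of the null response.
    `similarities` maps a suspect label to a similarity score (hypothetical scale).
    """
    return [name for name, score in similarities.items() if score >= threshold]


def accuracy_and_kappa(tp, fp, fn, tn):
    """Overall accuracy and Cohen's kappa from a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of correct decisions (overall accuracy).
    p_observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    p_expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if p_expected == 1.0:
        # Degenerate case: all decisions and all labels fall in one class.
        return p_observed, 0.0
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return p_observed, kappa


if __name__ == "__main__":
    # Hypothetical similarity scores for three suspects.
    scores = {"suspect_a": 0.91, "suspect_b": 0.42, "suspect_c": 0.88}
    print(identify(scores, threshold=0.80))  # two suspects pass
    print(identify(scores, threshold=0.95))  # null response: empty list

    # Hypothetical counts, chosen only to show the effect of class imbalance.
    acc, kappa = accuracy_and_kappa(tp=40, fp=10, fn=10, tn=940)
    print(f"accuracy={acc:.3f}, kappa={kappa:.3f}")
```

With imbalanced counts such as these, where rejections far outnumber identifications, kappa sits well below accuracy because it discounts the agreement expected by chance. That is one plausible reason for reporting both figures.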
Acknowledgment
This work is supported by the “Universidad Militar Nueva Granada-Vicerrectoría de Investigaciones” under the grant IMP-ING-2136 of 2016.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Camacho, S., Renza, D., Ballesteros L., D.M. (2017). A Semi-supervised Speaker Identification Method for Audio Forensics Using Cochleagrams. In: Figueroa-García, J., López-Santana, E., Villa-Ramírez, J., Ferro-Escobar, R. (eds) Applied Computer Sciences in Engineering. WEA 2017. Communications in Computer and Information Science, vol 742. Springer, Cham. https://doi.org/10.1007/978-3-319-66963-2_6
DOI: https://doi.org/10.1007/978-3-319-66963-2_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66962-5
Online ISBN: 978-3-319-66963-2
eBook Packages: Computer Science, Computer Science (R0)