A Semi-supervised Speaker Identification Method for Audio Forensics Using Cochleagrams

Camacho, Steven; Renza, Diego; Ballesteros L., Dora M.

doi:10.1007/978-3-319-66963-2_6


Steven Camacho, Diego Renza & Dora M. Ballesteros L.

Conference paper
First Online: 29 August 2017


Part of the book series: Communications in Computer and Information Science (CCIS, volume 742)

Abstract

The general task in speaker identification for audio forensics is to identify the unknown speaker in an audio proof, who is suspected of a crime. The voice of each person in a group of suspects is compared with the audio proof in order to determine which of them is its source. This paper proposes a semi-supervised speaker identification method that does not require a training stage. Feature extraction is based on cochleagrams of previously selected words. The system can identify one or more suspects who have high similarity to the audio proof, or give a null response if no suspect satisfies a similarity threshold. The results of the proposed method are compared with those of the same method using spectrograms instead of cochleagrams. Performance is measured through a confusion matrix (true and false positives, and true and false negatives), and global results are given in terms of overall accuracy and kappa index. Across several tests, the system achieves an overall accuracy higher than 0.97 and a kappa index of around 0.78, which indicates high confidence in both the identification and the rejection results.
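The decision rule and the two global metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how similarity between cochleagrams is computed or which threshold is used, so the similarity scores, the threshold value, and the function names below are hypothetical, and "kappa index" is assumed here to mean Cohen's kappa for a 2x2 confusion matrix.

```python
from typing import Dict, List


def identify(similarities: Dict[str, float], threshold: float) -> List[str]:
    """Return every suspect whose similarity meets the threshold.

    An empty list plays the role of the null response.
    The scores and the threshold are assumed inputs, not values from the paper.
    """
    return sorted(name for name, score in similarities.items() if score >= threshold)


def overall_accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all decisions (identifications and rejections) that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def cohen_kappa(tp: int, fp: int, fn: int, tn: int) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the row and column totals of the confusion matrix.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        # Degenerate case: every decision falls in a single class, kappa is undefined.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)
```

With hypothetical counts of 40 true positives, 10 false positives, 5 false negatives and 45 true negatives, `overall_accuracy` gives 0.85 and `cohen_kappa` gives 0.7. Because kappa discounts the agreement expected by chance, it can sit well below overall accuracy when one outcome (for example, correct rejections) dominates the confusion matrix.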



Acknowledgment

This work is supported by the “Universidad Militar Nueva Granada-Vicerrectoría de Investigaciones” under the grant IMP-ING-2136 of 2016.

Author information

Authors and Affiliations

Universidad Militar Nueva Granada, Bogotá, Colombia
Steven Camacho, Diego Renza & Dora M. Ballesteros L.

Corresponding author

Correspondence to Steven Camacho.

Editor information

Editors and Affiliations

Universidad Distrital Francisco José de Caldas, Bogotá, Colombia
Juan Carlos Figueroa-García
Universidad Distrital Francisco José de Caldas, Bogotá, Colombia
Eduyn Ramiro López-Santana
Universidad Tecnológica de Bolívar, Cartagena, Colombia
José Luis Villa-Ramírez
Universidad Distrital Francisco José de Caldas, Bogotá, Colombia
Roberto Ferro-Escobar

About this paper

Cite this paper

Camacho, S., Renza, D., Ballesteros L., D.M. (2017). A Semi-supervised Speaker Identification Method for Audio Forensics Using Cochleagrams. In: Figueroa-García, J., López-Santana, E., Villa-Ramírez, J., Ferro-Escobar, R. (eds) Applied Computer Sciences in Engineering. WEA 2017. Communications in Computer and Information Science, vol 742. Springer, Cham. https://doi.org/10.1007/978-3-319-66963-2_6

DOI: https://doi.org/10.1007/978-3-319-66963-2_6
Published: 29 August 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-66962-5
Online ISBN: 978-3-319-66963-2
eBook Packages: Computer Science, Computer Science (R0)
