A Semi-supervised Speaker Identification Method for Audio Forensics Using Cochleagrams

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 742)

Abstract

The general task in speaker identification for audio forensics is to identify the unknown speaker in an audio proof who is suspected of a crime. The voice of each person in a group of suspects is compared with the audio proof to determine which of them is the source. In this paper, a semi-supervised speaker identification method is proposed that does not require a training stage. Feature extraction is based on cochleagrams of previously selected words. The system can identify one or more suspects who have high similarity to the audio proof, or give a null response if none of the suspects satisfies a similarity threshold. The results of the proposed method are compared with those of the same method using spectrograms instead of cochleagrams. Performance is measured through a confusion matrix (true and false positives, and true and false negatives), and global results are given in terms of overall accuracy and kappa index. Across several tests, our system achieves an overall accuracy higher than 0.97 and a kappa index around 0.78, which indicates high confidence in both identification and rejection.
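The decision rule and the two global metrics named in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the similarity scores, the threshold value, and the confusion-matrix counts are all hypothetical, and the abstract does not say how similarity is computed. The kappa index is taken here to be Cohen's kappa for a 2x2 confusion matrix.

```python
from typing import Dict, List, Tuple


def select_suspects(similarity: Dict[str, float], threshold: float) -> List[str]:
    """Return every suspect whose similarity reaches the threshold.

    An empty list plays the role of the null response (hypothetical helper).
    """
    return [name for name, score in similarity.items() if score >= threshold]


def accuracy_and_kappa(tp: int, fp: int, fn: int, tn: int) -> Tuple[float, float]:
    """Overall accuracy and Cohen's kappa from a 2x2 confusion matrix."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of decisions that were correct.
    accuracy = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total**2
    if expected == 1.0:
        # Degenerate case: every decision falls in a single class.
        return accuracy, 0.0
    kappa = (accuracy - expected) / (1.0 - expected)
    return accuracy, kappa


if __name__ == "__main__":
    # Made-up similarity scores and threshold, for illustration only.
    scores = {"suspect_A": 0.91, "suspect_B": 0.42, "suspect_C": 0.88}
    print(select_suspects(scores, threshold=0.85))
    print(select_suspects(scores, threshold=0.95))  # empty list: null response

    # Made-up counts (not taken from the paper) with many true rejections.
    acc, kappa = accuracy_and_kappa(tp=8, fp=2, fn=2, tn=188)
    print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
```

With these made-up counts the accuracy is 0.98 while kappa is about 0.79. When true rejections dominate, chance agreement is already high, so kappa can sit well below accuracy, which is why reporting both is informative.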

Acknowledgment

This work is supported by the “Universidad Militar Nueva Granada-Vicerrectoría de Investigaciones” under the grant IMP-ING-2136 of 2016.

Author information

Corresponding author

Correspondence to Steven Camacho.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Camacho, S., Renza, D., Ballesteros L., D.M. (2017). A Semi-supervised Speaker Identification Method for Audio Forensics Using Cochleagrams. In: Figueroa-García, J., López-Santana, E., Villa-Ramírez, J., Ferro-Escobar, R. (eds) Applied Computer Sciences in Engineering. WEA 2017. Communications in Computer and Information Science, vol 742. Springer, Cham. https://doi.org/10.1007/978-3-319-66963-2_6

  • DOI: https://doi.org/10.1007/978-3-319-66963-2_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-66962-5

  • Online ISBN: 978-3-319-66963-2

  • eBook Packages: Computer Science (R0)
