Phase Based Mel Frequency Cepstral Coefficients for Speaker Identification

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 435)

Abstract

This paper proposes new phase-based Mel-frequency cepstral coefficients (PMFCC) for speaker identification. A Gaussian mixture model (GMM) combined with vector quantization (VQ) is used as the classifier. The identification performance of the proposed features is compared with that of MFCC features and phase features, and the PMFCC features are found to outperform both. Results are simulated on a database of ten Hindi digits spoken by fifty speakers. The paper also explores the usefulness of phase information for speaker recognition.
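As a rough illustration of the classification stage named in the abstract, the sketch below builds one diagonal-covariance GMM per speaker from a VQ (k-means) partition of that speaker's feature frames, then assigns a test utterance to the model with the highest average log-likelihood. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not say how VQ and GMM are combined, so using the VQ partition to set the mixture parameters without EM refinement is an assumption, the function names (`kmeans`, `train_gmm`, `identify`, `toy_frames`) are hypothetical, and the random 3-dimensional vectors merely stand in for real PMFCC frames.

```python
import math
import random


def kmeans(frames, k, iters=20, seed=0):
    """VQ step: Lloyd's algorithm. Returns the codebook and the final partition."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            nearest = min(range(k), key=lambda c: math.dist(f, centroids[c]))
            clusters[nearest].append(f)
        for j, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, clusters


def train_gmm(frames, k=2, var_floor=1e-3):
    """Assumed VQ+GMM coupling: each non-empty VQ cell becomes one Gaussian."""
    centroids, clusters = kmeans(frames, k)
    model = []
    for mean, members in zip(centroids, clusters):
        if not members:
            continue
        weight = len(members) / len(frames)
        var = [max(var_floor, sum((x - m) ** 2 for x in col) / len(members))
               for col, m in zip(zip(*members), mean)]
        model.append((weight, mean, var))
    return model


def log_likelihood(model, frames):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for f in frames:
        comps = []
        for weight, mean, var in model:
            ll = math.log(weight)
            for x, m, v in zip(f, mean, var):
                ll -= 0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
            comps.append(ll)
        top = max(comps)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total / len(frames)


def identify(models, frames):
    """Return the name of the speaker model that best explains the frames."""
    return max(models, key=lambda name: log_likelihood(models[name], frames))


def toy_frames(center, n, seed):
    """Synthetic stand-in for feature frames (NOT real PMFCC data)."""
    rng = random.Random(seed)
    return [[rng.gauss(c, 0.5) for c in center] for _ in range(n)]


train = {
    "speaker_a": toy_frames([0.0, 0.0, 0.0], 200, seed=1),
    "speaker_b": toy_frames([3.0, 3.0, 3.0], 200, seed=2),
}
models = {name: train_gmm(frames) for name, frames in train.items()}
print(identify(models, toy_frames([0.0, 0.0, 0.0], 50, seed=3)))
```

In practice the VQ partition would more commonly serve only as the starting point for EM training, and the frames would come from an actual feature extractor; both are left out here to keep the sketch short.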

Author information

Corresponding author

Correspondence to Sumit Srivastava.

Copyright information

© 2016 Springer India

About this paper

Cite this paper

Srivastava, S., Chandra, M., Sahoo, G. (2016). Phase Based Mel Frequency Cepstral Coefficients for Speaker Identification. In: Satapathy, S., Mandal, J., Udgata, S., Bhateja, V. (eds) Information Systems Design and Intelligent Applications. Advances in Intelligent Systems and Computing, vol 435. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2757-1_31

  • DOI: https://doi.org/10.1007/978-81-322-2757-1_31

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2756-4

  • Online ISBN: 978-81-322-2757-1

  • eBook Packages: Engineering, Engineering (R0)
