Discrimination Effectiveness of Speech Cepstral Features

  • Conference paper
Biometrics and Identity Management (BioID 2008)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5372)

Abstract

This work investigates the discrimination capabilities of speech cepstra for text-related and speaker-related information, using the Bhattacharyya distance as the measure of discrimination. The study covers static and dynamic cepstra derived using linear prediction analysis (LPCC) as well as mel-frequency analysis (MFCC). It also assesses linear prediction-based mel-frequency cepstral coefficients (LP-MFCC) as an alternative speech feature type. It is shown experimentally that, whilst contamination of the speech unfavourably affects the performance of all types of cepstra, the effects are more severe for MFCC. Furthermore, with a combination of static and dynamic features, LP-MFCC exhibit the best discrimination capabilities in almost all experimental cases.
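As a rough illustration of the discrimination measure named in the abstract, the sketch below computes the Bhattacharyya distance between two classes of feature vectors. It assumes each class is modelled as a Gaussian with a diagonal covariance matrix, which is a simplification chosen here for brevity; the abstract does not state the exact formulation used in the paper. The function names and the toy frames are hypothetical.

```python
import math
from statistics import fmean, pvariance


def bhattacharyya_diag(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two diagonal-covariance Gaussians.

    Assumption: diagonal covariances, so the distance is a sum of
    independent per-dimension contributions.
    """
    distance = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        v = 0.5 * (v1 + v2)  # averaged variance for this dimension
        # Term driven by the separation of the class means.
        distance += 0.125 * (m1 - m2) ** 2 / v
        # Term driven by the mismatch between the class variances.
        distance += 0.5 * math.log(v / math.sqrt(v1 * v2))
    return distance


def gaussian_stats(frames):
    """Per-dimension mean and population variance of a list of vectors."""
    columns = list(zip(*frames))
    return [fmean(c) for c in columns], [pvariance(c) for c in columns]


# Toy two-dimensional "cepstral" frames for two classes (made-up numbers,
# not data from the paper).
class_a = [[1.0, 0.0], [1.2, 0.2], [0.8, -0.2], [1.0, 0.0]]
class_b = [[2.0, 0.1], [2.2, 0.3], [1.8, -0.1], [2.0, 0.1]]

mean_a, var_a = gaussian_stats(class_a)
mean_b, var_b = gaussian_stats(class_b)
print(bhattacharyya_diag(mean_a, var_a, mean_b, var_b))
```

A larger value indicates that the two classes (for example, two speakers or two spoken texts) are more separable in the chosen feature space; identical distributions give a distance of zero.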

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Malegaonkar, A., Ariyaeeinia, A., Sivakumaran, P., Pillay, S. (2008). Discrimination Effectiveness of Speech Cepstral Features. In: Schouten, B., Juul, N.C., Drygajlo, A., Tistarelli, M. (eds) Biometrics and Identity Management. BioID 2008. Lecture Notes in Computer Science, vol 5372. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89991-4_10

  • DOI: https://doi.org/10.1007/978-3-540-89991-4_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89990-7

  • Online ISBN: 978-3-540-89991-4

  • eBook Packages: Computer Science, Computer Science (R0)
