Discrimination Effectiveness of Speech Cepstral Features

Malegaonkar, A.; Ariyaeeinia, A.; Sivakumaran, P.; Pillay, S.

doi:10.1007/978-3-540-89991-4_10

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 5372)

Included in the following conference series:

European Workshop on Biometrics and Identity Management

Abstract

In this work, the discrimination capabilities of speech cepstra for text- and speaker-related information are investigated. For this purpose, the Bhattacharyya distance is used as the measure of discrimination. The scope of the study covers static and dynamic cepstra derived using linear prediction analysis (LPCC) as well as mel-frequency analysis (MFCC). The investigations also include the assessment of linear prediction-based mel-frequency cepstral coefficients (LP-MFCC) as an alternative speech feature type. It is shown experimentally that, whilst contamination in speech adversely affects the performance of all types of cepstra, the effects are more severe in the case of MFCC. Furthermore, it is shown that, with a combination of static and dynamic features, LP-MFCC exhibit the best discrimination capabilities in almost all experimental cases.
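
The abstract names the Bhattacharyya distance as the measure of discrimination, but it does not state the exact formulation used. As a minimal sketch, assuming each class (for example, one speaker or one text unit) is modelled by a single multivariate Gaussian over cepstral feature vectors, the closed-form distance could be computed as below. The function name and the toy inputs are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def bhattacharyya_gaussian(mean_a, cov_a, mean_b, cov_b):
    """Bhattacharyya distance between two multivariate Gaussians.

    Assumption (not from the paper): each class is summarised by a mean
    vector and a full covariance matrix estimated from its feature vectors.
    """
    mean_a = np.asarray(mean_a, dtype=float)
    mean_b = np.asarray(mean_b, dtype=float)
    cov_a = np.atleast_2d(np.asarray(cov_a, dtype=float))
    cov_b = np.atleast_2d(np.asarray(cov_b, dtype=float))

    # Average covariance shared by both terms of the distance.
    cov_avg = 0.5 * (cov_a + cov_b)
    diff = mean_a - mean_b

    # Term 1: separation due to the difference between class means.
    mean_term = 0.125 * diff @ np.linalg.solve(cov_avg, diff)

    # Term 2: separation due to the difference between class covariances,
    # computed with log-determinants for numerical stability.
    _, logdet_avg = np.linalg.slogdet(cov_avg)
    _, logdet_a = np.linalg.slogdet(cov_a)
    _, logdet_b = np.linalg.slogdet(cov_b)
    cov_term = 0.5 * (logdet_avg - 0.5 * (logdet_a + logdet_b))

    return float(mean_term + cov_term)


# Toy example with made-up 2-D "features" (hypothetical values).
same = bhattacharyya_gaussian([0.0, 0.0], np.eye(2), [0.0, 0.0], np.eye(2))
shifted = bhattacharyya_gaussian([0.0, 0.0], np.eye(2), [2.0, 0.0], np.eye(2))
print(same, shifted)
```

A larger value indicates that the two classes are easier to separate in the chosen feature space. In practice the means and covariances would be estimated from frames of a given feature type, for instance with `np.mean(frames, axis=0)` and `np.cov(frames, rowvar=False)`.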

Author information

Authors and Affiliations

University of Hertfordshire, College Lane, Hatfield, Hertfordshire, AL10 9AB, UK
A. Malegaonkar, A. Ariyaeeinia, P. Sivakumaran & S. Pillay

Editor information

Editors and Affiliations

CWI/Fontys, 5600 AH, Eindhoven, The Netherlands
Ben Schouten
Roskilde University, 4000, Roskilde, Denmark
Niels Christian Juul
Swiss Federal Institute of Technology Lausanne (EPFL), 1015, Lausanne, Switzerland
Andrzej Drygajlo
University of Sassari, 07041, Alghero, Italy
Massimo Tistarelli

About this paper

Cite this paper

Malegaonkar, A., Ariyaeeinia, A., Sivakumaran, P., Pillay, S. (2008). Discrimination Effectiveness of Speech Cepstral Features. In: Schouten, B., Juul, N.C., Drygajlo, A., Tistarelli, M. (eds) Biometrics and Identity Management. BioID 2008. Lecture Notes in Computer Science, vol 5372. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89991-4_10

DOI: https://doi.org/10.1007/978-3-540-89991-4_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89990-7
Online ISBN: 978-3-540-89991-4
eBook Packages: Computer Science, Computer Science (R0)