Abstract
In this paper, temporal features (zero crossing rate and short-time energy) and spectral features (spectral flux and spectral centroid) are derived from the Teager energy operator (TEO) profile of the speech waveform. Their efficacy for classifying normal and dysphonic voices is analyzed by comparing their performance with that of features derived from the linear prediction (LP) residual and from the speech waveform itself. In addition, fusing these features with the state-of-the-art Mel frequency cepstral coefficient (MFCC) feature set is investigated to determine whether they carry complementary information. A 2nd-order polynomial classifier is used, and experiments are carried out on a subset of the Massachusetts Eye and Ear Infirmary (MEEI) database.
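As a rough illustration of the feature pipeline summarized in the abstract, the following Python sketch computes a TEO profile and then the four named features frame by frame. It uses the standard discrete form of the operator, ψ[x(n)] = x(n)² − x(n−1)·x(n+1). Everything else is an assumption made for illustration and is not taken from the chapter: the function names, the frame length and hop, the Hamming window, and the particular definition of spectral flux (squared difference between sum-normalized magnitude spectra of consecutive frames).

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    x = np.asarray(x, dtype=float)
    # One neighbour is needed on each side, so the output is 2 samples shorter.
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (trailing partial frame dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])


def temporal_features(frames):
    """Per-frame zero crossing rate and short-time energy."""
    signs = np.signbit(frames)
    # Fraction of adjacent sample pairs whose signs differ.
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    energy = np.sum(frames ** 2, axis=1)
    return zcr, energy


def spectral_features(frames, sample_rate):
    """Per-frame spectral centroid (Hz) and spectral flux (assumed definition)."""
    window = np.hamming(frames.shape[1])  # window choice is an assumption
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sample_rate)
    eps = 1e-12  # guards against division by zero on silent frames
    # Centroid: magnitude-weighted mean frequency of each frame.
    centroid = (mag @ freqs) / (mag.sum(axis=1) + eps)
    # Flux: squared change in the normalized spectrum between consecutive frames.
    norm = mag / (mag.sum(axis=1, keepdims=True) + eps)
    flux = np.sum(np.diff(norm, axis=0) ** 2, axis=1)
    flux = np.concatenate([[0.0], flux])  # first frame has no predecessor
    return centroid, flux
```

A hypothetical usage, assuming `speech` is a 1-D array sampled at 8 kHz: `frames = frame_signal(teager_energy(speech), 200, 100)`, followed by `temporal_features(frames)` and `spectral_features(frames, 8000)`. Applying the same two feature functions to frames of the raw waveform or of an LP residual would give the comparison feature sets the abstract mentions.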
Copyright information
© 2012 Springer-Verlag GmbH Berlin Heidelberg
About this chapter
Cite this chapter
Patil, H.A., Baljekar, P.N., Basu, T.K. (2012). Novel Temporal and Spectral Features Derived from TEO for Classification Normal and Dysphonic Voices. In: Sambath, S., Zhu, E. (eds) Frontiers in Computer Education. Advances in Intelligent and Soft Computing, vol 133. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27552-4_76
DOI: https://doi.org/10.1007/978-3-642-27552-4_76
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27551-7
Online ISBN: 978-3-642-27552-4
eBook Packages: Engineering (R0)