Fundamental Frequency Extraction in Speech Emotion Recognition

Stasiak, Bartłomiej; Rychlicki-Kicior, Krzysztof

doi:10.1007/978-3-642-30721-8_29

Part of the book series: Communications in Computer and Information Science (CCIS, volume 287)

Included in the following conference series:

International Conference on Multimedia Communications, Services and Security

Abstract

Emotion recognition in speech signals has received much attention recently because of its usefulness in many applications of human-computer interaction. Fundamental frequency extraction is one of the most crucial factors in successful emotion recognition. In this work, the parameters of an autocorrelation-based algorithm for fundamental frequency detection are analysed using the Berlin emotional speech database (EMO-DB). The results show that lower-than-standard values of the upper limit of the analysed frequency range tend to improve the classification outcome. Statistics of prosody contours and Mel-frequency cepstral coefficients (MFCC) were used to construct the feature set, and a support vector machine (SVM) was used as the classifier, yielding high recognition rates.
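
The abstract does not give the details of the autocorrelation-based detector, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It picks the highest peak of a normalised autocorrelation within a lag range derived from a lower and an upper frequency limit, and then summarises the resulting contour with a few statistics. The function names, the default limits `f_min` and `f_max`, and the `voicing_threshold` are illustrative assumptions that do not come from the paper.

```python
import math
import statistics


def estimate_f0(frame, sample_rate, f_min=75.0, f_max=600.0,
                voicing_threshold=0.3):
    """Estimate F0 (Hz) of one frame from its autocorrelation peak.

    Only lags that correspond to frequencies in [f_min, f_max] are searched.
    All default values are illustrative assumptions, not the paper's settings.
    Returns 0.0 when no peak exceeds voicing_threshold (treated as unvoiced).
    """
    n = len(frame)
    if n < 2:
        return 0.0
    mean = sum(frame) / n
    x = [s - mean for s in frame]            # remove DC offset
    energy = sum(s * s for s in x)           # autocorrelation at lag 0
    if energy == 0.0:
        return 0.0
    lag_lo = max(1, math.ceil(sample_rate / f_max))   # upper limit -> shortest lag
    lag_hi = min(n - 1, int(sample_rate / f_min))     # lower limit -> longest lag
    best_lag, best_r = 0, voicing_threshold
    for lag in range(lag_lo, lag_hi + 1):
        # Normalised autocorrelation at this lag.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    return sample_rate / best_lag if best_lag else 0.0


def f0_contour(signal, sample_rate, frame_len, hop, f_min=75.0, f_max=600.0):
    """Frame-wise F0 contour of a signal (0.0 marks unvoiced frames)."""
    return [estimate_f0(signal[start:start + frame_len], sample_rate,
                        f_min, f_max)
            for start in range(0, len(signal) - frame_len + 1, hop)]


def contour_stats(contour):
    """Simple statistics over the voiced part of an F0 contour."""
    voiced = [f for f in contour if f > 0.0]
    if not voiced:
        return {"mean": 0.0, "min": 0.0, "max": 0.0, "std": 0.0}
    return {"mean": statistics.mean(voiced),
            "min": min(voiced),
            "max": max(voiced),
            "std": statistics.pstdev(voiced)}


if __name__ == "__main__":
    fs = 16000
    # Synthetic 200 Hz tone, 0.2 s long, standing in for a voiced segment.
    tone = [math.sin(2 * math.pi * 200.0 * i / fs) for i in range(fs // 5)]
    contour = f0_contour(tone, fs, frame_len=640, hop=320)
    print(contour_stats(contour))
```

In this sketch the upper frequency limit `f_max` sets the shortest lag that is searched, so lowering it removes short-lag candidates from consideration. Statistics such as those returned by `contour_stats` could then be combined with MFCC-based features and passed to an SVM, along the lines the abstract describes.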

Author information

Authors and Affiliations

Institute of Information Technology, Technical University of Łódź, ul. Wólczańska 215, 90-924, Łódź, Poland
Bartłomiej Stasiak & Krzysztof Rychlicki-Kicior

Editor information

Editors and Affiliations

Department of Telecommunications, AGH University of Science and Technology, Krakow, Poland
Andrzej Dziech
Multimedia Systems Department, Gdansk University of Technology, Narutowicza 11/22, 80-233, Gdansk, Poland
Andrzej Czyżewski

Cite this paper

Stasiak, B., Rychlicki-Kicior, K. (2012). Fundamental Frequency Extraction in Speech Emotion Recognition. In: Dziech, A., Czyżewski, A. (eds) Multimedia Communications, Services and Security. MCSS 2012. Communications in Computer and Information Science, vol 287. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30721-8_29

DOI: https://doi.org/10.1007/978-3-642-30721-8_29
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30720-1
Online ISBN: 978-3-642-30721-8
eBook Packages: Computer Science, Computer Science (R0)