Speech Emotional Recognition Using Global and Time Sequence Structure Features with MMD

Zhao, Li; Cao, Yujia; Wang, Zhiping; Zou, Cairong

doi:10.1007/11573548_40

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3784)

Included in the following conference series:

International Conference on Affective Computing and Intelligent Interaction

Abstract

In this paper, global features and time-sequence structure features are combined as the characteristic parameters for speech emotion recognition. A new method based on the MMD (Modified Mahalanobis Distance) formula is proposed to reduce estimation errors and simplify the calculation. Four emotions are considered: happiness, anger, surprise, and sadness. A total of 1000 sentences for recognition, collected from 10 speakers, were used to demonstrate the effectiveness of the new method. The average emotion recognition rate reached 95%. A comparison with the MQDF (modified quadratic discriminant function) method [1] also showed that MMD performs better than MQDF.
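The abstract does not reproduce the MMD formula, so the following is only a minimal sketch of the general idea it builds on: assign a feature vector to the emotion class whose mean is nearest in Mahalanobis distance. Everything here is an assumption made for illustration, not the authors' method: the function names, the two-dimensional toy features, the two example labels, and the small `ridge` term added to the covariance diagonal so that it stays invertible with few samples.

```python
from statistics import fmean


def mean_vector(rows):
    """Per-dimension mean of a list of feature vectors."""
    return [fmean(col) for col in zip(*rows)]


def covariance(rows, mean, ridge=1e-3):
    """Sample covariance plus a small diagonal ridge (illustrative assumption)."""
    d, n = len(mean), len(rows)
    cov = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [x - m for x, m in zip(row, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j] / (n - 1)
    for i in range(d):
        cov[i][i] += ridge
    return cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean)."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    return sum(d * s for d, s in zip(diff, solve(cov, diff)))


def fit(samples_by_label):
    """Estimate one (mean, covariance) pair per emotion label."""
    models = {}
    for label, rows in samples_by_label.items():
        mu = mean_vector(rows)
        models[label] = (mu, covariance(rows, mu))
    return models


def classify(x, models):
    """Return the label whose class model is nearest to x."""
    return min(models, key=lambda label: mahalanobis_sq(x, *models[label]))


# Made-up 2-D feature vectors, purely for demonstration.
train = {
    "anger": [[5.0, 1.0], [5.5, 1.2], [4.8, 0.9], [5.2, 1.1]],
    "sadness": [[1.0, 3.0], [1.2, 3.3], [0.9, 2.8], [1.1, 3.1]],
}
models = fit(train)
print(classify([5.1, 1.0], models))
print(classify([1.0, 3.0], models))
```

In a real system the vectors would hold the combined global and time-sequence features, and the distance would follow the modified formula defined in the full paper.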

References

Cai, L., Jiang, C., Wang, Z., Zhao, L., Zou, C.: A Method Combining the Global and Time Series Structure Features for Emotion Recognition in Speech. In: IEEE Int. Conf. Neural Networks & Signal Processing (2003)
Iida, A., Campbell, N., Iga, S., Higuchi, F., Yasumura, M.: Acoustic Nature and Perceptual Testing of Corpora of Emotional Speech
Banse, R., Scherer, K.R.: Acoustic Profiles in Vocal Emotion Expression. Journal of Personality and Social Psychology 70(3) (1996)
Mozziconacci, S.: Speech Variability and Emotion: Production and Perception. Technische Universiteit Eindhoven, Eindhoven (1998)
Scherer, K.R.: Speech and Emotional States. In: Darby, J.K. (ed.) Speech Evaluation in Psychiatry. Grune and Stratton, New York (1981)
Soskin, W.F., Kauffman, P.E.: Judgements of Emotions in Word-free Voice Samples. Journal of Communication (1961)
Li, Z., Xiangmin, Q., Cairong, Z., Zhenyang, W.: A Study on Emotional Recognition in Speech Signal. Journal of Software 12(7) (2001)
Cowie, R.: Emotion Recognition in Human-Computer Interaction. IEEE Signal Processing Magazine 18(1), 32–80 (2001)
Muraka, S.: Emotional Constituents in Text and Emotional Components in Speech. Ph.D. Thesis, Kyoto Institute of Technology, Kyoto, Japan (1998)
Shigenaga, M.: Features of Emotionally Uttered Speech Revealed by Discriminant Analysis (VI). The Preprint of the Acoustical Society of Japan, pp. 2–18 (1999)
Li, Z., Xiangmin, Q., Cairong, Z., Zhenyang, W.: A Study on Emotional Feature Analysis and Recognition in Speech Signal. Journal of China Institute of Communications 21(1), 18–25 (2000)
Li, Z., Xiangmin, Q., Cairong, Z., Zhenyang, W.: A Study on Emotional Feature Extract in Speech Signal. Data Collection and Process 15(1), 120–123 (2000)

Author information

Authors and Affiliations

Research Center of Learning Science, Southeast University, Nanjing, 210096, China
Li Zhao & Yujia Cao
Department of Radio Engineering, Southeast University, Nanjing, 210096, China
Li Zhao, Yujia Cao, Zhiping Wang & Cairong Zou

Editor information

Editors and Affiliations

National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences,
Jianhua Tao
National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
MIT Media Laboratory, 20 Ames Street, 02139, Cambridge, MA, USA
Rosalind W. Picard

About this paper

Cite this paper

Zhao, L., Cao, Y., Wang, Z., Zou, C. (2005). Speech Emotional Recognition Using Global and Time Sequence Structure Features with MMD. In: Tao, J., Tan, T., Picard, R.W. (eds) Affective Computing and Intelligent Interaction. ACII 2005. Lecture Notes in Computer Science, vol 3784. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11573548_40

DOI: https://doi.org/10.1007/11573548_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29621-8
Online ISBN: 978-3-540-32273-3
eBook Packages: Computer Science, Computer Science (R0)
