Audio-Visual Emotion Recognition Based on Hidden Markov Model

Zhao, Jingxuan; Wu, Xiao; Jiang, Dongmei

doi:10.1007/978-3-642-25778-0_14


Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 129)


Abstract

Recognizing emotions in elderly and disabled people is an essential part of determining whether they need help. This paper presents a study of audio-visual emotion recognition based on the Hidden Markov Model (HMM). In audio-visual emotion recognition, feature extraction and HMM training are key issues. Emotion features are extracted from speech and facial image sequences, and the HTK toolkit is used to train hidden Markov models for audio, visual, and audio-visual multi-stream emotion recognition. In general, the recognition rates of the audio-visual multi-stream HMMs are slightly higher than those of the audio-only and visual-only HMMs, and the recognition rates of negative emotions differ slightly from those of positive emotions.
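The multi-stream idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' HTK configuration: it assumes a state-synchronous multi-stream HMM with one diagonal-covariance Gaussian per state and stream, hypothetical stream weights, and toy models with placeholder labels (`emotion_a`, `emotion_b`). Each emotion has its own HMM, the per-state emission score is a weighted sum of the audio and visual log-likelihoods, and a sequence is assigned to the emotion whose model gives the highest forward log-likelihood.

```python
import numpy as np


def log_gauss(x, mean, var):
    """Log density of a diagonal Gaussian; x is (T, D), mean and var are (D,)."""
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=-1)


def multistream_log_emissions(streams, means, variances, weights):
    """Return a (T, N) matrix of stream-weighted per-state log emission scores.

    streams:   list of (T, D_s) feature arrays, one per stream (audio, visual)
    means:     list of (N, D_s) arrays, one per stream
    variances: list of (N, D_s) arrays, one per stream
    weights:   list of stream weights (hypothetical values, chosen by the caller)
    """
    n_frames = streams[0].shape[0]
    n_states = means[0].shape[0]
    log_b = np.zeros((n_frames, n_states))
    for x, mu, var, w in zip(streams, means, variances, weights):
        for j in range(n_states):
            log_b[:, j] += w * log_gauss(x, mu[j], var[j])
    return log_b


def forward_log_likelihood(log_pi, log_A, log_b):
    """Forward algorithm in the log domain; returns the sequence log-likelihood."""
    alpha = log_pi + log_b[0]
    for t in range(1, log_b.shape[0]):
        # Sum over previous states i for each next state j, then emit frame t.
        alpha = np.logaddexp.reduce(alpha[:, None] + log_A, axis=0) + log_b[t]
    return np.logaddexp.reduce(alpha)


def classify(streams, models, weights):
    """Score the sequence under every emotion model and pick the best label."""
    scores = {}
    for label, m in models.items():
        log_b = multistream_log_emissions(streams, m["means"], m["vars"], weights)
        scores[label] = forward_log_likelihood(m["log_pi"], m["log_A"], log_b)
    return max(scores, key=scores.get), scores


def make_toy_model(center, n_states=3, dims=(3, 2)):
    """Build a toy ergodic model (illustration only, not trained on any data)."""
    return {
        "log_pi": np.log(np.full(n_states, 1.0 / n_states)),
        "log_A": np.log(np.full((n_states, n_states), 1.0 / n_states)),
        "means": [np.full((n_states, d), center) for d in dims],
        "vars": [np.ones((n_states, d)) for d in dims],
    }


if __name__ == "__main__":
    audio = np.full((20, 3), 2.0)   # stand-in for per-frame speech features
    visual = np.full((20, 2), 2.0)  # stand-in for per-frame facial features
    models = {"emotion_a": make_toy_model(2.0), "emotion_b": make_toy_model(-2.0)}
    label, scores = classify([audio, visual], models, weights=[0.5, 0.5])
    print(label, scores)
```

Setting one stream weight to zero reduces this sketch to an audio-only or visual-only classifier, which is one simple way to compare the three configurations the abstract mentions.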



Author information

Authors and Affiliations

School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Jingxuan Zhao, Xiao Wu & Dongmei Jiang


Editor information

Editors and Affiliations

College of Communication Engineering, Jilin University, Room 313, Building No.1, Changchun, Nanhu Avenue 5372, Jilin, 130012, People's Republic of China
Zhihong Qian
Department of Electrical Engineering, The University of Mississippi, Anderson Hall 314, Mississippi, 38677, Mississippi, USA
Lei Cao
Department of Electrical and Computer En, Naval Postgraduate School, Rm. 452 Spanagel Bldg. 232, Dyer Road 833, Monterey, 93943-5121, California, USA
Weilian Su
Faculty of Computing, London Metropolitan University, Holloway Road 166-220, London, N7 8DB, United Kingdom
Tingkai Wang
College of Software, Changchun University of Science and Tech, Changchun, Jilin, 130022, People's Republic of China
Huamin Yang


About this chapter

Cite this chapter

Zhao, J., Wu, X., Jiang, D. (2012). Audio-Visual Emotion Recognition Based on Hidden Markov Model. In: Qian, Z., Cao, L., Su, W., Wang, T., Yang, H. (eds) Recent Advances in Computer Science and Information Engineering. Lecture Notes in Electrical Engineering, vol 129. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25778-0_14


DOI: https://doi.org/10.1007/978-3-642-25778-0_14
Published: 05 February 2012
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25777-3
Online ISBN: 978-3-642-25778-0
eBook Packages: Engineering, Engineering (R0)
