Speech Emotion Recognition Using Multiple Classifiers

Wang, Kunxia; Chu, Zongcheng; Wang, Kai; Yu, Tongqing; Liu, Li

doi:10.1007/978-3-319-69781-9_9


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10612)

Included in the following conference series:

Asia-Pacific Web (APWeb) and Web-Age Information Management (WAIM) Joint Conference on Web and Big Data


Abstract

How to automatically identify a speaker's emotional state has received much attention. In this paper, we focus on speech emotion recognition and develop an audio-based classification framework that identifies five emotions in our audio database, whose segments are taken from Chinese TV plays. First, acoustic features are extracted from the audio segments using wavelet analysis. Feature selection is then performed with information gain and Sequential Forward Selection to discard irrelevant information and reduce dimensionality. The classification framework is built on three base classifiers: SVM, AdaBoost, and Random Forest. Because a single classifier has limited recognition capability, decision fusion methods are applied to aggregate the labels predicted by the different classifiers. In experiments on our database, the proposed fusion methods perform better than the single classifiers.
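The decision-fusion step can be illustrated with a minimal sketch. The abstract does not say which fusion rules the authors use, so the example below assumes plain majority voting, one common way to aggregate predicted labels. The function name `fuse_by_majority`, the tie-breaking policy, the emotion labels, and the sample predictions are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def fuse_by_majority(predictions, tie_breaker):
    """Fuse per-classifier label lists into one label per sample.

    predictions: dict mapping a classifier name to its list of predicted labels.
    tie_breaker: name of the classifier whose label is preferred when several
                 labels share the top vote count (an assumed policy).
    """
    names = list(predictions)
    n_samples = len(predictions[names[0]])
    if any(len(predictions[name]) != n_samples for name in names):
        raise ValueError("all classifiers must label the same samples")

    fused = []
    for i in range(n_samples):
        votes = Counter(predictions[name][i] for name in names)
        top_count = max(votes.values())
        tied = [label for label, count in votes.items() if count == top_count]
        preferred = predictions[tie_breaker][i]
        # Use the tie-breaker's label only if it is among the top-voted labels;
        # otherwise fall back to the first top-voted label encountered.
        fused.append(preferred if preferred in tied else tied[0])
    return fused


# Made-up predictions for four audio segments (illustrative only).
predictions = {
    "svm":           ["angry", "happy", "sad",     "neutral"],
    "adaboost":      ["angry", "sad",   "sad",     "happy"],
    "random_forest": ["happy", "happy", "neutral", "surprise"],
}
fused = fuse_by_majority(predictions, tie_breaker="svm")
print(fused)  # ['angry', 'happy', 'sad', 'neutral']
```

In the last segment all three classifiers disagree, so the label from the assumed tie-breaker (`svm`) is used. A weighted vote or a rule based on class probabilities would be a natural variant, but nothing in the abstract indicates which one the paper adopts.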

Acknowledgements

This work was supported by the Open Project Program of the National Laboratory of Pattern Recognition (NLPR) (No. 201700014), Anhui Provincial Natural Science Foundation (No. 1708085MF167), and Anhui Province Key Laboratory project of affective computing and advanced intelligent machines under grant ACAIM160103. Any correspondence should be made to Li Liu and Kunxia Wang.

Author information

Authors and Affiliations

Key Lab of Artificial Architecture, Anhui Jianzhu University, Hefei, China
Kunxia Wang, Zongcheng Chu, Kai Wang & Tongqing Yu
School of Software Engineering, Chongqing University, Chongqing, China
Li Liu

Corresponding authors

Correspondence to Kunxia Wang or Li Liu.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Shaoxu Song
George Mason University, Fairfax, Virginia, USA
Matthias Renz
Kangwon National University, Chuncheon, Korea (Republic of)
Yang-Sae Moon

About this paper

Cite this paper

Wang, K., Chu, Z., Wang, K., Yu, T., Liu, L. (2017). Speech Emotion Recognition Using Multiple Classifiers. In: Song, S., Renz, M., Moon, YS. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10612. Springer, Cham. https://doi.org/10.1007/978-3-319-69781-9_9

DOI: https://doi.org/10.1007/978-3-319-69781-9_9
Published: 08 November 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69780-2
Online ISBN: 978-3-319-69781-9
eBook Packages: Computer Science, Computer Science (R0)
