Ensemble Models for Enhancement of an Arabic Speech Emotion Recognition System

Zantout, Rached; Klaylat, Samira; Hamandi, Lama; Osman, Ziad

doi:10.1007/978-3-030-12385-7_15

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 70)

Included in the following conference series:

Future of Information and Communication Conference

Abstract

Ensemble classification models have been widely used in machine learning to enhance the performance of single classifiers. In this paper, we study the effect of employing five ensemble models, namely Bagging, AdaBoost, LogitBoost, Random Subspace and Random Committee, on a vocal emotion recognition system. The system recognizes happy, angry, and surprised emotions from natural Arabic speech, where the highest accuracy among single classifiers, 95.52%, is obtained by SMO. After applying the ensemble models to 19 single classifiers, the best enhanced accuracy is 95.95%, again achieved by SMO. The highest improvement in accuracy, 19.09%, was achieved by the Boosting technique with Naïve Bayes Multinomial as the base classifier.
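The comparison described in the abstract, a single base classifier against the same classifier wrapped in ensemble models, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: it uses scikit-learn rather than the authors' toolkit, synthetic Poisson count data rather than the Arabic speech features, and the hypothetical helper names `make_count_data` and `compare_ensembles`. Random Subspace is approximated with `BaggingClassifier` sampling features without resampling instances. LogitBoost and Random Committee are left out because scikit-learn has no direct counterpart.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB


def make_count_data(n_samples=300, n_features=20, n_classes=3, seed=0):
    """Synthetic non-negative count features (assumption: stand-in data only)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, n_classes, size=n_samples)
    # Each class gets its own vector of Poisson rates, one rate per feature.
    rates = rng.uniform(1.0, 5.0, size=(n_classes, n_features))
    X = rng.poisson(rates[y])
    return X, y


def compare_ensembles(seed=0):
    """Mean 5-fold accuracy of one base classifier, alone and in ensembles."""
    X, y = make_count_data(seed=seed)
    base = MultinomialNB()
    models = {
        "single": base,
        # Bagging: each member is trained on a bootstrap sample of instances.
        "bagging": BaggingClassifier(base, n_estimators=10, random_state=seed),
        # Random Subspace: all instances, a random half of the features.
        "random_subspace": BaggingClassifier(
            base, n_estimators=10, bootstrap=False,
            max_features=0.5, random_state=seed,
        ),
        # AdaBoost: members are trained in sequence on reweighted instances.
        "adaboost": AdaBoostClassifier(base, n_estimators=10, random_state=seed),
    }
    return {
        name: cross_val_score(model, X, y, cv=5).mean()
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, accuracy in compare_ensembles().items():
        print(f"{name:16s} {accuracy:.4f}")
```

The accuracies printed by this sketch say nothing about the figures reported in the abstract. It only shows the shape of the experiment: the base classifier is held fixed, the ensemble wrapper varies, and the cross-validated accuracies are compared.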

Author information

Authors and Affiliations

Rafik Hariri University, Almechref, Lebanon
Rached Zantout
Beirut Arab University, Beirut, Lebanon
Samira Klaylat & Ziad Osman
American University of Beirut, Beirut, Lebanon
Lama Hamandi

Corresponding author

Correspondence to Samira Klaylat.

Editor information

Editors and Affiliations

Faculty of Science and Engineering, Saga University, Saga, Japan
Kohei Arai
The Science and Information (SAI) Organization, Bradford, UK
Rahul Bhatia

About this paper

Cite this paper

Zantout, R., Klaylat, S., Hamandi, L., Osman, Z. (2020). Ensemble Models for Enhancement of an Arabic Speech Emotion Recognition System. In: Arai, K., Bhatia, R. (eds) Advances in Information and Communication. FICC 2019. Lecture Notes in Networks and Systems, vol 70. Springer, Cham. https://doi.org/10.1007/978-3-030-12385-7_15

DOI: https://doi.org/10.1007/978-3-030-12385-7_15
Published: 02 February 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12384-0
Online ISBN: 978-3-030-12385-7
eBook Packages: Intelligent Technologies and Robotics (R0)
