Abstract
Ensemble classification models have been widely used in machine learning to enhance the performance of single classifiers. In this paper, we study the effect of employing five ensemble models, namely Bagging, AdaBoost, LogitBoost, Random Subspace and Random Committee, on a vocal emotion recognition system. The system recognizes happy, angry, and surprised emotions from natural Arabic speech; among single classifiers, the highest accuracy, 95.52%, is obtained by SMO. After applying the ensemble models to 19 single classifiers, the best enhanced accuracy is 95.95%, also achieved by SMO. The highest improvement in accuracy was 19.09%, achieved by the Boosting technique with Naïve Bayes Multinomial as the base classifier.
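To make the idea of wrapping a single classifier in an ensemble concrete, the following is a minimal sketch of Bagging (bootstrap resampling plus majority voting) in plain Python. It is not the authors' experimental setup: the two-dimensional features are synthetic, the nearest-centroid base learner is a stand-in chosen for brevity, and all function names (`make_data`, `train_centroid`, `bagging_fit`, and so on) are hypothetical.

```python
import random
from collections import Counter

# Assumption: synthetic 2-D "features" per emotion class, not real acoustic data.
CENTERS = {"happy": (0.0, 0.0), "angry": (5.0, 5.0), "surprised": (0.0, 5.0)}

def make_data(n_per_class, rng):
    """Draw n_per_class noisy points around each class center."""
    data = []
    for label, (cx, cy) in CENTERS.items():
        for _ in range(n_per_class):
            data.append(([rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)], label))
    return data

def train_centroid(samples):
    """Stand-in base learner: nearest-centroid. Returns {label: centroid}."""
    sums, counts = {}, Counter()
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict_centroid(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda lab: dist2(model[lab]))

def bagging_fit(samples, n_models=10, seed=0):
    """Train each base model on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        bootstrap = [rng.choice(samples) for _ in samples]
        models.append(train_centroid(bootstrap))
    return models

def majority_vote(labels):
    """Return the most frequent label among the base models' predictions."""
    return Counter(labels).most_common(1)[0][0]

def bagging_predict(models, features):
    """Combine the base models' predictions by majority vote."""
    return majority_vote(predict_centroid(m, features) for m in models)

rng = random.Random(42)
train = make_data(30, rng)
held_out = make_data(10, rng)
models = bagging_fit(train, n_models=10, seed=0)
predictions = [bagging_predict(models, x) for x, _ in held_out]
accuracy = sum(p == y for p, (_, y) in zip(predictions, held_out)) / len(held_out)
print(f"bagged accuracy on toy data: {accuracy:.2%}")
```

The other ensembles named in the abstract differ mainly in how the base models are diversified: boosting reweights training examples between rounds, and Random Subspace trains each model on a random subset of the features rather than a bootstrap sample of the instances.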
© 2020 Springer Nature Switzerland AG
Cite this paper
Zantout, R., Klaylat, S., Hamandi, L., Osman, Z. (2020). Ensemble Models for Enhancement of an Arabic Speech Emotion Recognition System. In: Arai, K., Bhatia, R. (eds) Advances in Information and Communication. FICC 2019. Lecture Notes in Networks and Systems, vol 70. Springer, Cham. https://doi.org/10.1007/978-3-030-12385-7_15
DOI: https://doi.org/10.1007/978-3-030-12385-7_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-12384-0
Online ISBN: 978-3-030-12385-7
eBook Packages: Intelligent Technologies and Robotics (R0)