International Journal of Speech Technology

, Volume 21, Issue 1, pp 167–183 | Cite as

Choice of a classifier, based on properties of a dataset: case study-speech emotion recognition

  • Shashidhar G. Koolagudi
  • Y. V. Srinivasa Murthy
  • Siva P. Bhaskar
Article
  • 52 Downloads

Abstract

In this paper, the process of selecting a classifier based on the properties of dataset is designed since it is very difficult to experiment the data on n—number of classifiers. As a case study speech emotion recognition is considered. Different combinations of spectral and prosodic features relevant to emotions are explored. The best subset of the chosen set of features is recommended for each of the classifiers based on the properties of chosen dataset. Various statistical tests have been used to estimate the properties of dataset. The nature of dataset gives an idea to select the relevant classifier. To make it more precise, three other clustering and classification techniques such as K-means clustering, vector quantization and artificial neural networks are used for experimentation and results are compared with the selected classifier. Prosodic features like pitch, intensity, jitter, shimmer, spectral features such as mel frequency cepstral coefficients (MFCCs) and formants are considered in this work. Statistical parameters of prosody such as minimum, maximum, mean (\(\mu\)) and standard deviation (\(\sigma\)) are extracted from speech and combined with basic spectral (MFCCs) features to get better performance. Five basic emotions namely anger, fear, happiness, neutral and sadness are considered. For analysing the performance of different datasets on different classifiers, content and speaker independent emotional data is used, collected from Telugu movies. Mean opinion score of fifty users is collected to label the emotional data. To make it more accurate, one of the benchmark IIT-Kharagpur emotional database is used to generalize the conclusions.

Keywords

Properties of dataset Normality tests Selection of classifier Spectral and prosodic features Jitter Shimmer 

References

  1. Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723.MathSciNetCrossRefMATHGoogle Scholar
  2. Anagnostopoulos, C.-N., Iliou, T., & Giannoukos, I. (2012). Features and classifiers for emotion recognition from speech: A survey from 2000 to 2011. Artificial Intelligence Review, 43, 1–23.Google Scholar
  3. Ananthapadmanabha, T., & Yegnanarayana, B. (1979). Epoch extraction from linear prediction residual for identification of closed glottis interval. IEEE Transactions on Acoustics, Speech and Signal Processing, 27(4), 309–319.CrossRefGoogle Scholar
  4. Banse, R., & Scherer, K. R. (1996). Acoustic profiles in vocal emotion expression. Journal of Personality and Social Psychology, 70(3), 614.CrossRefGoogle Scholar
  5. Bhatti, M. W., Wang, Y., & Guan, L. (2004). A neural network approach for human emotion recognition in speech. In ISCAS’04. Proceedings of the 2004 international symposium on Circuits and systems, 2004, (Vol. 2, pp II–181). IEEEGoogle Scholar
  6. Bishop, C. M., et al. (1995). Neural networks for pattern recognition. New York: Oxford University PressGoogle Scholar
  7. Bitouk, D., Verma, R., & Nenkova, A. (2010). Class-level spectral features for emotion recognition. Speech Communication, 52(7), 613–625.CrossRefGoogle Scholar
  8. Black, M. J., & Yacoob, Y. (1995). Tracking and recognizing rigid and non-rigid facial motions using local parametric models of image motion. In Proceedings, fifth international conference on Computer vision, 1995, (pp. 374–381). IEEE.Google Scholar
  9. Bou-Ghazale, S. E., & Hansen, J. H. L. (2000). A comparative study of traditional and newly proposed features for recognition of speech under stress. IEEE Transactions on Speech and Audio Processing, 8(4), 429–442.CrossRefGoogle Scholar
  10. Burges, C. J. C. (1998). A tutorial on support vector machines for pattern recognition. Data Mining and Knowledge Discovery, 2(2), 121–167.CrossRefGoogle Scholar
  11. Busso, C., Deng, Z., Yildirim, S., Bulut, M., Lee, C.  M., Kazemzadeh, A., Lee, S., Neumann, U., & Narayanan, S. (2004). Analysis of emotion recognition using facial expressions, speech and multimodal information. In Proceedings of the 6th international conference on Multimodal interfaces (pp. 205–211). ACMGoogle Scholar
  12. Chakraborty, R., Pandharipande, M., & Kopparapu, S. K. (2016). Knowledge-based framework for intelligent emotion recognition in spontaneous speech. Procedia Computer Science, 96, 587–596.CrossRefGoogle Scholar
  13. Chauhan, A., Koolagudi, S. G., Kafley, S., & Rao, K. S. (2010). Emotion recognition using lp residual. In Students’ technology symposium (TechSym), 2010 IEEE (pp. 255–261). IEEE.Google Scholar
  14. Chavhan, Y., Dhore, M. L., & Yesaware, P. (2010). Speech emotion recognition using support vector machine. International Journal of Computer Applications, 1(20), 6–9.CrossRefGoogle Scholar
  15. Chen, C., You, M., Song, M., Bu, J., Liu, J. (2006). An enhanced speech emotion recognition system based on discourse information. In Computational Science–ICCS 2006 (pp. 449–456). New York: Springer (2006).Google Scholar
  16. Chung-Hsien, W., & Liang, W.-B. (2011). Emotion recognition of affective speech based on multiple classifiers using acoustic-prosodic information and semantic labels. IEEE Transactions on Affective Computing, 2(1), 10–21.CrossRefGoogle Scholar
  17. Cowie, R., & Cornelius, R. R. (2003). Describing the emotional states that are expressed in speech. Speech Communication, 40(1), 5–32.CrossRefMATHGoogle Scholar
  18. Dai, K., Fell, H. J., & MacAuslan, J. (2008). Recognizing emotion in speech using neural networks. Telehealth and Assistive Technologies, 31, 38–43.Google Scholar
  19. Deller, J. R. P., John G., & Hansen, J. H.L. (2000). Discrete-time processing of speech signals. New York: IEEE.Google Scholar
  20. Demšar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7, 1–30.MathSciNetMATHGoogle Scholar
  21. Deng, J., Xinzhou, X., Zhang, Z., Frühholz, S., & Schuller, B. (2017). Universum autoencoder-based domain adaptation for speech emotion recognition. IEEE Signal Processing Letters, 24(4), 500–504.CrossRefGoogle Scholar
  22. Deng, J., Zhang, Z., Eyben, F., & Schuller, B. (2014). Autoencoder-based unsupervised domain adaptation for speech emotion recognition. IEEE Signal Processing Letters, 21(9), 1068–1072.CrossRefGoogle Scholar
  23. El Ayadi, M., Kamel, M. S., & Karray, F. (2011). Survey on speech emotion recognition: Features, classification schemes, and databases. Pattern Recognition, 44(3), 572–587.CrossRefMATHGoogle Scholar
  24. El-Yazeed, M. F., El Gamal, M. A., & El Ayadi, M. M. H. (2004). On the determination of optimal model order for gmm-based text-independent speaker identification. EURASIP Journal on Applied Signal Processing, 1078–1087, 2004.MATHGoogle Scholar
  25. Essa, I. A., & Pentland, A. P. (1997). Coding, analysis, interpretation, and recognition of facial expressions. IEEE transactions on Pattern analysis and machine intelligence, 19(7):757–763.Google Scholar
  26. Farrus, M., & Hernando, J. (2009). Using jitter and shimmer in speaker verification. IET Signal Processing, 3(4), 247–257.CrossRefGoogle Scholar
  27. Firoz, S.A., Raji, S.A., & Babu, A.P. (2009). Automatic emotion recognition from speech using artificial neural networks with gender-dependent databases. In ACT’09. International conference on Advances in computing, control, & telecommunication technologies, 2009, (pp. 162–164). IEEEGoogle Scholar
  28. Foo, S. W., & De Silva, L. C. (2003). Speech emotion recognition using hidden markov models. Speech Communication, 41(4), 603–623.CrossRefGoogle Scholar
  29. Fu, L., Mao, X., & Chen, L. (2008). Relative speech emotion recognition based artificial neural network. In Computational intelligence and industrial application, 2008. PACIIA’08. Pacific-Asia workshop on (Vol. 2, pp. 140–144). IEEEGoogle Scholar
  30. Giannoulis, Panagiotis, & Potamianos, Gerasimos (2012). A hierarchical approach with feature selection for emotion recognition from speech. In LREC (pp. 1203–1206)Google Scholar
  31. Grimm, M., Kroschel, K., & Narayanan, S. (2007). Support vector regression for automatic recognition of spontaneous emotions in speech. In IEEE international conference on acoustics, speech and signal processing, 2007. ICASSP 2007, (vol. 4, pp. IV–1085). IEEEGoogle Scholar
  32. Han, J., & Kamber, M. (2006). Data Mining. Southeast Asia Edition: Concepts and Techniques. Morgan kaufmann.MATHGoogle Scholar
  33. Han, K., Yu, D., & Tashev, I. (2014). Speech emotion recognition using deep neural network and extreme learning machine. In Fifteenth annual conference of the international speech communication association.Google Scholar
  34. Hernando, J., Nadeu, C., & Mariño, J. B. (1997). Speech recognition in a noisy car environment based on lp of the one-sided autocorrelation sequence and robust similarity measuring techniques. Speech Communication, 21(1), 17–31.CrossRefGoogle Scholar
  35. Hess, W. J. (2008). Pitch and voicing determination of speech with an extension toward music signals. In Springer Handbook of Speech Processing, (pp. 181–212). Berlin: Springer.Google Scholar
  36. Heuft, B., Portele, T., & Rauth, M. (1996). Emotions in time domain synthesis. In Proceedings, fourth international conference on Spoken Language, 1996. ICSLP 96, (Vol. 3, pp. 1974–1977). IEEEGoogle Scholar
  37. Huang, J., Yang, W., & Zhou, D. (2012). Variance-based gaussian kernel fuzzy vector quantization for emotion recognition with short speech. In 2012 IEEE 12th international conference on Computer and information technology (CIT), (pp. 557–560). IEEE.Google Scholar
  38. Iida, A., Campbell, N., Higuchi, F., & Yasumura, M. (2003). A corpus-based speech synthesis system with emotion. Speech Communication, 40(1), 161–187.CrossRefMATHGoogle Scholar
  39. Iida, A., Campbell, N., Iga, S., Higuchi, F., Yasumura, M. (2000). A speech synthesis system with emotion for assisting communication. In ISCA tutorial and research workshop (ITRW) on speech and emotion.Google Scholar
  40. Ingale, A. B., & Chaudhari, D. S. (2012). Speech emotion recognition. International Journal of Soft Computing and Engineering (IJSCE), 2(1), 235–238.Google Scholar
  41. Jawarkar, N. P., et al. (2007). Emotion recognition using prosody features and a fuzzy min-max neural classifier. The Institution of Electronics and Telecommunication Engineers, 24(5), 369–373.Google Scholar
  42. Kaiser, L. (1962). Communication of affects by single vowels. Synthese, 14(4), 300–319.CrossRefGoogle Scholar
  43. Kenji, M. A. S. E. (1991). Recognition of facial expression from optical flow. IEICE TRANSACTIONS on Information and Systems, 74(10), 3474–3483.Google Scholar
  44. Khanchandani, K. B., & Hussain, M. A. (2009). Emotion recognition using multilayer perceptron and generalized feed forward neural network. Journal of Scientific and Industrial Research, 68(5), 367.Google Scholar
  45. Khanna, P., & Kumar, M. S. (2011). Application of vector quantization in emotion recognition from human speech. In Information intelligence, systems, technology and management (pp. 118–125). New York: Springer.Google Scholar
  46. Kohavi, R., et al. (1995). A study of cross-validation and bootstrap for accuracy estimation and model selection. Ijcai, 14, 1137–1145.Google Scholar
  47. Konar, A., & Chakraborty, A. (2014). Emotion recognition: A pattern analysis approach. Wiley: Hobroken, NJ.Google Scholar
  48. Koolagudi, S. G., & Rao, K. S. (2012). Emotion recognition from speech: A review. International Journal of Speech Technology, 15(2), 99–117.CrossRefGoogle Scholar
  49. Koolagudi, S. G., Maity, S., Kumar, V. A., Chakrabarti, S., & Rao, K. S. (2009). Iitkgp-sesc: Speech database for emotion analysis. In International conference on contemporary computing (pp. 485–492). New York: SpringerGoogle Scholar
  50. Koolagudi, S. G., Nandy, S., & Rao, K. S. (2009). Spectral features for emotion classification. In Advance computing conference, 2009. IACC 2009. IEEE International (pp. 1292–1296). IEEEGoogle Scholar
  51. Koolagudi, S. G., Reddy, R., & Rao, K. S. (2010). Emotion recognition from speech signal using epoch parameters. In 2010 international conference on Signal processing and communications (SPCOM), (pp. 1–5). IEEE.Google Scholar
  52. Kostoulas, T.P., & Fakotakis, N. (2006). A speaker dependent emotion recognition framework. In Proceedings 5th international symposium, communication systems, networks and digital signal processing (CSNDSP), University of Patras (pp. 305–309)Google Scholar
  53. Krothapalli, S. R., & Koolagudi, S. G. (2013). Speech emotion recognition: A review. In Emotion recognition using speech features, pp. 15–34. New York: Springer.Google Scholar
  54. Kwon, O.-W., Chan, K., Hao, J., & Lee, T.-W. (2003). Emotion recognition by speech signals. In INTERSPEECH.Google Scholar
  55. Le Bouquin, R. (1996). Enhancement of noisy speech signals: Application to mobile radio communications. Speech Communication, 18(1), 3–19.CrossRefGoogle Scholar
  56. Lee, K.-F., & Hon, H.-W. (1989). Speaker-independent phone recognition using hidden markov models. IEEE transactions on acoustics, speech and signal processing , 37(11), 1641–1648.CrossRefGoogle Scholar
  57. Lee, C. M., Yildirim, S., Bulut, M., Kazemzadeh, A., Busso, C., Deng, Z., Lee, S., & Narayanan, S. (2004). Emotion recognition based on phoneme classes. In INTERSPEECH (pp. 205–211).Google Scholar
  58. Li, Y., & Zhao, Y. (1998). Recognizing emotions in speech using short-term and long-term features. In ICSLP.Google Scholar
  59. Li, J. Q. & Barron, A. R. (1999). Mixture density estimation. In Advances in neural information processing systems 12. Citeseer.Google Scholar
  60. Li, X., Tao, J., Johnson, M. T., Soltis, J., Savage, A., Leong, K. M., & Newman, J. D. (2007). Stress and emotion classification using jitter and shimmer features. In Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on (Vol. 4, pp. IV–1081). IEEE.Google Scholar
  61. Lilliefors, H. W. (1967). On the kolmogorov-smirnov test for normality with mean and variance unknown. Journal of the American Statistical Association, 62(318), 399–402.CrossRefGoogle Scholar
  62. Lin, Y.-L., & Wei, G. (2005). Speech emotion recognition based on hmm and svm. In Proceedings of 2005 international conference on Machine learning and cybernetics, 2005, (Vol. 8, pp. 4898–4901). IEEE.Google Scholar
  63. Linde, Y., Buzo, A., & Gray, R. M. (1980). An algorithm for vector quantizer design. IEEE transactions on Communications, 28(1):84–95Google Scholar
  64. Liu, H., & Lei, Y. (2005). Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on knowledge and data engineering, 17(4), 491–502.MathSciNetCrossRefGoogle Scholar
  65. Luengo, I., Navas, E., Hernáez, I., Sánchez, J. (2005). Automatic emotion recognition using prosodic parameters. In INTERSPEECH (pp. 493–496).Google Scholar
  66. Mardia., K. V. (1970). Measures of multivariate skewness and kurtosis with applications. Biometrika, 57(3), 519–530.MathSciNetCrossRefMATHGoogle Scholar
  67. Motamed, S., Setayeshi, S., & Rabiee, A. (2017). Speech emotion recognition based on a modified brain emotional learning model. Biologically Inspired Cognitive Architectures, 19, 32–38.CrossRefGoogle Scholar
  68. Muslea, I., Minton, S., & Knoblock, C. A. (2006). Active learning with multiple views. Journal of Artificial Intelligence Research, 27, 203–233.MathSciNetMATHGoogle Scholar
  69. Muthusamy, H., Polat, K., & Yaacob, S. (2015). Improved emotion recognition using gaussian mixture model and extreme learning machine in speech and glottal signals. Mathematical Problems in Engineering, 2015.Google Scholar
  70. Neiberg, D., Elenius, K., & Laskowski, K. (2006). Emotion recognition in spontaneous speech using gmms. In INTERSPEECH.Google Scholar
  71. Nicholson, J., Takahashi, K., & Nakatsu, R. (2000). Emotion recognition in speech using neural networks. Neural Computing & Applications, 9(4), 290–296.CrossRefMATHGoogle Scholar
  72. Nogueiras, A., Moreno, A., Bonafonte, A., & Mariño, J. B. (2001). Speech emotion recognition using hidden markov models. In INTERSPEECH (pp. 2679–2682).Google Scholar
  73. Nooteboom, S. (1997). The prosody of speech: Melody and rhythm. The Handbook of Phonetic Sciences, 5, 640–673.Google Scholar
  74. Nwe, T. L., Wei, F. S., & De Silva, L. C. (2001). Speech based emotion classification. In TENCON 2001, Proceedings of IEEE region 10 international conference on electrical and electronic technology, IEEE, (Vol. 1, pp. 297–301).Google Scholar
  75. Ortony, A. (1990). The cognitive structure of emotions. Cambridge: Cambridge University Press.Google Scholar
  76. Pan, Y., Shen, P., & Shen, L. (2012). Speech emotion recognition using support vector machine. International Journal of Smart Home, 6(2), 101–107.Google Scholar
  77. Partila, P., & Voznak, M. (2013). Speech emotions recognition using 2-d neural classifier. In Nostradamus 2013: Prediction, modeling and analysis of complex systems (pp. 221–231). New York: Springer.Google Scholar
  78. Petrushin, V. A. (2000). Emotion recognition in speech signal: experimental study, development, and application. Studies, 3, 4.Google Scholar
  79. Polzin, T. S. & Waibel, A. (1998). Detecting emotions in speech. In Proceedings of the CMC (Vol. 16). CiteseerGoogle Scholar
  80. Rabiner, L. (1989). A tutorial on hidden markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2), 257–286.CrossRefGoogle Scholar
  81. Rabiner, L. R., & Schafer, R. W. (1978). Digital processing of speech signals (Vol. 100). Englewood Cliffs: Prentice-hall.Google Scholar
  82. Rabiner, L. R., & Juang, B.-H. (1993). In Fundamentals of speech recognition (Vol. 14). Englewood Cliffs: PTR Prentice Hall .Google Scholar
  83. Rao, S. K., Koolagudi, S. G., & Vempada, R. R. (2013). Emotion recognition from speech using global and local prosodic features. International Journal of Speech Technology, 16(2), 143–160.CrossRefGoogle Scholar
  84. Rao, K. S., & Koolagudi, S. G. (2012). Emotion recognition using speech features. New York: Springer Science & Business Media.Google Scholar
  85. Rao, K. S., Reddy, R., Maity, S., & Koolagudi, S. G. (2010). Characterization of emotions using the dynamics of prosodic. In Proceedings of speech prosody (Vol. 4).Google Scholar
  86. Razak, A. A., Komiya, R., Izani, M., & Abidin, Z. (2005). Comparison between fuzzy and nn method for speech emotion recognition. In ICITA 2005. Third international conference on Information technology and applications, 2005, (Vol. 1, pp. 297–302). IEEEGoogle Scholar
  87. Reddy, S. Arundathy, Singh, Amarjot, Kumar, N. Sumanth, & Sruthi, K.S. (2011). The decisive emotion identifier. In 2011 3rd international conference on electronics computer technology (ICECT), (Vol. 2, pp. 28–32). IEEE.Google Scholar
  88. Rencher, A. C., & Christensen, W. F. (2012). Methods of multivariate analysis (Vol. 709). New York: Wiley.Google Scholar
  89. Reynolds, D. A., Quatieri, T. F., & Dunn, R. B. (2000). Speaker verification using adapted gaussian mixture models. Digital Signal Processing, 10(1), 19–41.CrossRefGoogle Scholar
  90. Reynolds, D. A., & Rose, R. C. (1995). Robust text-independent speaker identification using gaussian mixture speaker models. IEEE Transactions on speech and audio processing, 3(1), 72–83.CrossRefGoogle Scholar
  91. Rissanen, J. (1978). Modeling by shortest data description. Automatica, 14(5), 465–471.CrossRefMATHGoogle Scholar
  92. Rojas, R. (2013). Neural networks: A systematic introduction. Berlin: Springer Science & Business Media.MATHGoogle Scholar
  93. Sato, N., & Obuchi, Y. (2007). Emotion recognition using mel-frequency cepstral coefficients. Information and Media Technologies, 2(3), 835–848.Google Scholar
  94. Scherer, K. R. (2003). Vocal communication of emotion: A review of research paradigms. Speech Communication, 40(1), 227–256.CrossRefMATHGoogle Scholar
  95. Scherer, K. R. (1989). Vocal correlates of emotional arousal and affective disturbance. In Handbook of social psychophysiology (pp. 165–197).Google Scholar
  96. Schuller, B., Müller, R., Lang, M., & Rigoll, G. (2005). Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles. In Ninth European Conference on Speech Communication and Technology.Google Scholar
  97. Schuller, B., Rigoll, G., & Lang, M. (2003). Hidden markov model-based speech emotion recognition. In Proceedings. (ICASSP’03). 2003 IEEE international conference on acoustics, speech, and signal processing, 2003, (Vol. 2, pp. II–1). IEEE.Google Scholar
  98. Schuller, B., Rigoll, G., & Lang, M. (2004). Speech emotion recognition combining acoustic features and linguistic information in a hybrid support vector machine-belief network architecture. In Proceedings (ICASSP’04). IEEE international conference on acoustics, speech, and signal processing, 2004, (Vol. 1, pp. I–577). IEEE.Google Scholar
  99. Seehapoch, T., & Wongthanavasu, S. (2013). Speech emotion recognition using support vector machines. In 2013 5th international conference on Knowledge and smart technology (KST) (pp. 86–91). IEEE.Google Scholar
  100. Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52(3/4), 591–611.MathSciNetCrossRefMATHGoogle Scholar
  101. Shen, P., Changjun, Z., & Chen, X. (2011). Automatic speech emotion recognition using support vector machine. In 2011 International conference on electronic and mechanical engineering and information technology (EMEIT), (Vol. 2, pp. 621–625). IEEE.Google Scholar
  102. Siqing, W., Falk, T. H., & Chan, W.-Y. (2011). Automatic speech emotion recognition using modulation spectral features. Speech Communication, 53(5), 768–785.CrossRefGoogle Scholar
  103. Soares, C., & Brazdil, P. B. (2000). Zoomed ranking: Selection of classification algorithms based on relevant performance information. In European conference on principles of data mining and knowledge discovery, (pp. 126–135). New York: SpringerGoogle Scholar
  104. Song, P., Jin, Y., Zhao, L., & Xin, M. (2014). Speech emotion recognition using transfer learning. IEICE TRANSACTIONS on Information and Systems, 97(9), 2530–2532.CrossRefGoogle Scholar
  105. Song, P., Zheng, W., Shifeng, O., Zhang, X., Jin, Y., Liu, J., et al. (2016). Cross-corpus speech emotion recognition based on transfer non-negative matrix factorization. Speech Communication, 83, 34–41.CrossRefGoogle Scholar
  106. Soong, F. K., Rosenberg, A. E., Juang, B.-H., & Rabiner, L. R. (1987). Report: A vector quantization approach to speaker recognition. AT&T Technical Journal, 66(2):14–26Google Scholar
  107. Stuhlsatz, A., Meyer, C., Eyben, F., ZieIke, T., Meier, G., & Schuller, B. (2011). Deep neural networks for acoustic emotion recognition: Raising the benchmarks. In 2011 IEEE international conference on acoustics, speech and signal processing (ICASSP), (pp. 5688–5691). IEEEGoogle Scholar
  108. Takahashi, K. (2004). Remarks on svm-based emotion recognition from multi-modal bio-potential signals. In ROMAN 2004. 13th IEEE international workshop on Robot and human interactive communication, 2004, (pp. 95–100). IEEE.Google Scholar
  109. Tang, J., Alelyani, S., & Liu, H. (2014). Feature selection for classification: A review. Data classification: Algorithms and applications, p. 37Google Scholar
  110. Tang, H., Chu, S. M., Hasegawa-Johnson, M., & Huang, T.  S. (2009). Emotion recognition from speech via boosted gaussian mixture models. In IEEE international conference on Multimedia and expo, 2009. ICME 2009, (pp. 294–297). IEEE.Google Scholar
  111. Tian, Y., Kanade, T., & Cohn, J. F. (2000). Recognizing lower face action units for facial expression analysis. In Proceedings, fourth IEEE international conference on Automatic face and gesture recognition, 2000, (pp. 484–490). IEEE.Google Scholar
  112. Traunmüller, H., & Eriksson, A. (1995). The frequency range of the voice fundamental in the speech of male and female adults. Unpublished Manuscript Google Scholar
  113. Trigeorgis, G., Ringeval, F., Brueckner, R., Marchi, E., Nicolaou, M. A., Schuller, B., & Zafeiriou, S. (2016). Adieu features? end-to-end speech emotion recognition using a deep convolutional recurrent network. In 2016 IEEE international conference on acoustics, speech and signal processing (ICASSP), (pp. 5200–5204). IEEEGoogle Scholar
  114. Ververidis, D., & Kotropoulos, C. (2006). Emotional speech recognition: Resources, features, and methods. Speech Communication, 48(9), 1162–1181.CrossRefGoogle Scholar
  115. Ververidis, D., & Kotropoulos, C. (2005). Emotional speech classification using gaussian mixture models. In IEEE international symposium on circuits and systems, 2005. ISCAS 2005, (pp. 2871–2874). IEEE.Google Scholar
  116. Vlassis, N., & Likas, A. (1999). A kurtosis-based dynamic approach to gaussian mixture modeling. IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems and Humans, 29(4), 393–399.CrossRefGoogle Scholar
  117. Vlassis, N., & Likas, A. (2002). A greedy em algorithm for gaussian mixture learning. Neural Processing Letters, 15(1), 77–87.CrossRefMATHGoogle Scholar
  118. Vogt, T., André, E., & Bee, N. (2008). EmoVoice—A framework for online recognition of emotions from voice. In Perception in multimodal dialogue systems (pp. 188–199). Springer.Google Scholar
  119. Wang, K., An, N., Li, B. N., Zhang, Y., & Li, L. (2015). Speech emotion recognition using fourier parameters. IEEE Transactions on Affective Computing, 6(1), 69–75.CrossRefGoogle Scholar
  120. Wang, L. (2005). Support vector machines: Theory and applications, (Vol. 177). Springer Science & Business Media.Google Scholar
  121. Wenjing, H., Haifeng, L., & Chunyu, G. (2009). A hybrid speech emotion perception method of vq-based feature processing and ann recognition. In WRI global congress on Intelligent systems, 2009. GCIS’09, (Vol. 2, pp. 145–149). IEEE.Google Scholar
  122. Womack, B. D., & Hansen, J. H. L. (1999). N-channel hidden markov models for combined stressed speech classification and recognition. IEEE Transactions on Speech and Audio Processing, 7(6), 668–677.CrossRefGoogle Scholar
  123. Wu, S., Falk, T. H., & Chan, W. Y. (2011). Automatic speech emotion recognition using modulation spectral features. Speech communication, 53(5), 768–785.CrossRefGoogle Scholar
  124. Xiong, H., Junjie, W., & Chen, J. (2009). K-means clustering versus validation measures: A data-distribution perspective. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 39(2), 318–331.CrossRefGoogle Scholar
  125. Yacoob, Y., & Davis, L. (1994). Computing spatio-temporal representations of human faces. In 1994 IEEE computer society conference on Computer vision and pattern recognition, 1994. Proceedings CVPR’94, (pp. 70–75). IEEEGoogle Scholar
  126. Yamada, T., Hashimoto, H., & Tosa, N. (1995). Pattern recognition of emotion with neural network. In Proceedings of the 1995 IEEE IECON 21st international conference on Industrial electronics, control, and instrumentation, 1995, (Vol. 1, pp. 183–187). IEEEGoogle Scholar
  127. Yang, B., & Lugger, M. (2010). Emotion recognition from speech signals using new harmony features. Signal Processing, 90(5), 1415–1423.CrossRefMATHGoogle Scholar
  128. Yegnanarayana, B. (1994). Artificial neural networks for pattern recognition. Sadhana, 19(2), 189–238.MathSciNetCrossRefGoogle Scholar
  129. Yu, C., Tian, Q., Cheng, F., & Zhang, S. (2011). Speech emotion recognition using support vector machines. In Advanced research on computer science and information engineering (pp. 215–220). New York: SpringerGoogle Scholar
  130. Zheng, W., Xin, M., Wang, X., & Wang, B. (2014). A novel speech emotion recognition method via incomplete sparse least square regression. IEEE Signal Processing Letters, 21(5), 569–572.CrossRefGoogle Scholar
  131. Zhou, Y., Sun, Y., Zhang, J., & Yan, Y. (2009). Speech emotion recognition using both spectral and prosodic features. In ICIECS 2009. International conference on Information engineering and computer science, 2009, (pp. 1–4). IEEE.Google Scholar
  132. Zhou, J., Wang, G., Yang, Y., & Chen, P. (2006). Speech emotion recognition based on rough set and svm. In 5th IEEE international conference on Cognitive informatics, 2006. ICCI 2006, (Vol. 1, pp. 53–61). IEEE.Google Scholar

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. 1.Department of CSENational Institute of Technology KarnatakaMangaloreIndia

Personalised recommendations