Emotion Speech Recognition Based on Adaptive Fractional Deep Belief Network and Reinforcement Learning

Sangeetha, J.; Jayasankar, T.

doi:10.1007/978-981-13-0617-4_16

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 768)

Abstract

Identifying emotion is a challenging task that has come to the fore with the rapid development of human–computer interaction frameworks. Speech Emotion Recognition (SER) can be characterized as the extraction of the speaker's emotional state from their spoken utterances. Detecting emotion is difficult for a computer because its expression varies from speaker to speaker. To address this problem, the proposed system is built on an Adaptive Fractional Deep Belief Network (AFDBN) and Reinforcement Learning (RL). Pitch chroma, spectral flux, tonal power ratio and MFCC features are extracted from the speech signal. The extracted features are then passed to the classification stage. Finally, performance is analyzed using evaluation metrics and compared with that of existing systems.
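
As a rough illustration of one of the listed features, spectral flux can be sketched with NumPy as follows. The chapter's own formulation is not visible in this preview, so the sketch assumes a common textbook definition: the Euclidean distance between the magnitude spectra of consecutive frames, normalized by the number of frequency bins. The function names, frame length, hop size and Hann window are illustrative assumptions, not the authors' settings.

```python
import numpy as np


def frame_signal(x, frame_len=512, hop=256):
    """Split a 1-D signal into overlapping frames, one frame per row."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def spectral_flux(x, frame_len=512, hop=256):
    """Spectral flux between consecutive frames (assumed definition).

    Returns one value per pair of adjacent frames: the Euclidean norm of
    the change in magnitude spectrum, divided by the number of bins.
    """
    # Window each frame to reduce spectral leakage (Hann is an assumption).
    frames = frame_signal(x, frame_len, hop) * np.hanning(frame_len)
    # Magnitude spectrum of each frame (non-negative frequencies only).
    mag = np.abs(np.fft.rfft(frames, axis=1))
    # Frame-to-frame difference of the magnitude spectra.
    diff = np.diff(mag, axis=0)
    return np.sqrt((diff ** 2).sum(axis=1)) / mag.shape[1]
```

Under this definition the flux stays close to zero while the spectrum is steady and rises where the spectral content changes abruptly, so it would be expected to respond to rapid changes in a speech signal.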

Author information

Authors and Affiliations

Department of Information Technology/SOC, SASTRA Deemed University, Thanjavur, Tamil Nadu, India
J. Sangeetha
Department of ECE, University College of Engineering, BIT Campus, Anna University, Tiruchirappalli, Tamil Nadu, India
T. Jayasankar

Corresponding author

Correspondence to J. Sangeetha.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Vignana Bharathi Institute of Technology, Hyderabad, Telangana, India
Pradeep Kumar Mallick
Faculty of Engineering, Aurel Vlaicu University of Arad, Arad, Romania
Valentina Emilia Balas
Department of Electrical and Electronics Engineering, Sikkim Manipal Institute of Technology, Sikkim Manipal University, Rangpo, India
Akash Kumar Bhoi
Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, Middlesex, United Kingdom
Ahmed F. Zobaa

About this paper

Cite this paper

Sangeetha, J., Jayasankar, T. (2019). Emotion Speech Recognition Based on Adaptive Fractional Deep Belief Network and Reinforcement Learning. In: Mallick, P., Balas, V., Bhoi, A., Zobaa, A. (eds) Cognitive Informatics and Soft Computing. Advances in Intelligent Systems and Computing, vol 768. Springer, Singapore. https://doi.org/10.1007/978-981-13-0617-4_16

DOI: https://doi.org/10.1007/978-981-13-0617-4_16
Published: 12 August 2018
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-0616-7
Online ISBN: 978-981-13-0617-4
eBook Packages: Intelligent Technologies and Robotics (R0)