Emotion Recognition Based on Gramian Encoding Visualization

  • Jie-Lin Qiu
  • Xin-Yi Qiu
  • Kai Hu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11309)


This paper addresses the difficulty of applying emotional computing intuitively in practical fields such as medical diagnosis, a difficulty that arises because physiological signals are hard to understand directly. Because people understand two-dimensional images far more readily than one-dimensional signals, we use Gramian Angular Fields (GAF) to visualize time series signals. A GAF image is a Gramian matrix in which each element is the trigonometric sum between different time intervals. We then apply Tiled Convolutional Neural Networks (tiled CNNs) to three real-world datasets to learn high-level features from the GAF images. Our method achieves better classification results than the state-of-the-art approaches. It makes visualization-based emotion recognition possible, which benefits real medical practice, for example by making the diagnosis of cognitive diseases more intuitive.
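The Gramian Angular Field encoding described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the summation variant of the encoding, in which the series is rescaled to [-1, 1], each value is mapped to an angle via arccos, and entry (i, j) is cos(phi_i + phi_j). The function name `gramian_angular_field` and the sine-wave test signal are hypothetical and chosen only for illustration.

```python
import numpy as np


def gramian_angular_field(series):
    """Encode a 1-D series as a Gramian Angular (summation) Field image.

    Assumption: min-max rescaling to [-1, 1] and the cos(phi_i + phi_j)
    form; the paper's exact preprocessing may differ.
    """
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale so that every value lies in [-1, 1] and arccos is defined.
    x = (2.0 * x - hi - lo) / (hi - lo)
    x = np.clip(x, -1.0, 1.0)  # guard against floating-point overshoot
    phi = np.arccos(x)  # polar-coordinate angle of each time step
    # Broadcasting builds the full matrix of pairwise angle sums.
    return np.cos(phi[:, None] + phi[None, :])


# Hypothetical stand-in for one channel of a physiological signal.
signal = np.sin(np.linspace(0.0, 2.0 * np.pi, 64))
image = gramian_angular_field(signal)
print(image.shape)
```

The resulting matrix is symmetric and bounded in [-1, 1], so it can be rendered directly as an image or passed to a convolutional network as a single-channel input.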


Keywords: Emotion recognition, EEG, Gramian Angular Fields, Tiled CNN, Medical diagnosis



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Shanghai Jiao Tong University, Shanghai, China
  2. Sun Yat-sen University, Guangzhou, China
