Affective Music Information Retrieval

Emotions and Personality in Personalized Services

Part of the book series: Human–Computer Interaction Series (HCIS)

Abstract

Much of the appeal of music lies in its power to convey emotions or moods and to evoke them in listeners. As a consequence, the past decade has witnessed growing interest in modeling emotion from musical signals within the music information retrieval (MIR) community. In this chapter, we present a novel generative approach to music emotion modeling, with a specific focus on the valence–arousal (VA) dimensional model of emotion. The presented generative model, called acoustic emotion Gaussians (AEG), better accounts for the subjectivity of emotion perception through the use of probability distributions. Specifically, it learns from the emotion annotations of multiple subjects a Gaussian mixture model in the VA space, with prior constraints on the corresponding acoustic features of the training music pieces. This computational framework is technically sound, capable of learning in an online fashion, and thus applicable to a variety of applications, including user-independent (general) and user-dependent (personalized) emotion recognition, emotion-based music retrieval, and tag-to-VA projection. We report evaluations of these applications of AEG on a large-scale emotion-annotated corpus, AMG1608, to demonstrate the effectiveness of AEG and to showcase how evaluations are conducted in research on emotion-based MIR. Directions for future work are also discussed.
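The abstract's central idea is to model a clip's perceived emotion as a probability distribution in the valence–arousal plane rather than as a single point. The sketch below is an illustrative simplification, not the AEG algorithm itself: the helpers `va_gaussian` and `gaussian_kl` are hypothetical names, and a single 2-D Gaussian stands in for AEG's full Gaussian mixture. It summarizes multi-subject annotations as a mean and covariance, and compares two such Gaussians with the closed-form KL divergence, which could serve as a ranking score in emotion-based retrieval.

```python
import numpy as np

def va_gaussian(annotations):
    """Summarize several subjects' (valence, arousal) ratings for one
    music clip as a 2-D Gaussian: mean vector and sample covariance.
    The spread of the covariance reflects inter-subject disagreement."""
    X = np.asarray(annotations, dtype=float)   # shape (n_subjects, 2)
    return X.mean(axis=0), np.cov(X, rowvar=False)

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL divergence KL(N0 || N1) between two multivariate
    Gaussians; an asymmetric dissimilarity usable for ranking clips."""
    k = mu0.shape[0]
    inv1 = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(inv1 @ cov0) + diff @ inv1 @ diff - k
                  + np.log(np.linalg.det(cov1) / np.linalg.det(cov0)))

# Hypothetical annotations (valence, arousal) in [-1, 1] from five subjects
ann = [(0.6, 0.4), (0.5, 0.55), (0.7, 0.3), (0.4, 0.6), (0.55, 0.5)]
mu, cov = va_gaussian(ann)
```

In a retrieval setting, one would compute `gaussian_kl` between a query distribution and each clip's distribution and rank clips by ascending divergence; AEG itself additionally conditions the mixture on acoustic features, which this sketch omits.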



Author information


Corresponding author

Correspondence to Ju-Chiang Wang.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Wang, J.-C., Yang, Y.-H., Wang, H.-M. (2016). Affective Music Information Retrieval. In: Tkalčič, M., De Carolis, B., de Gemmis, M., Odić, A., Košir, A. (eds.) Emotions and Personality in Personalized Services. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-319-31413-6_12

  • DOI: https://doi.org/10.1007/978-3-319-31413-6_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-31411-2

  • Online ISBN: 978-3-319-31413-6

  • eBook Packages: Computer Science (R0)
