Abstract
Much of the appeal of music lies in its power to convey emotions and moods and to evoke them in listeners. Consequently, the past decade has witnessed growing interest in the music information retrieval (MIR) community in modeling emotion from musical signals. In this chapter, we present a novel generative approach to music emotion modeling, with a specific focus on the valence–arousal (VA) dimensional model of emotion. The presented generative model, called acoustic emotion Gaussians (AEG), better accounts for the subjectivity of emotion perception through the use of probability distributions. Specifically, it learns from the emotion annotations of multiple subjects a Gaussian mixture model in the VA space, with prior constraints on the corresponding acoustic features of the training music pieces. Such a computational framework is technically sound, capable of learning in an online fashion, and thus applicable to a variety of applications, including user-independent (general) and user-dependent (personalized) emotion recognition, emotion-based music retrieval, and tag-to-VA projection. We report evaluations of these applications of AEG on a large-scale emotion-annotated corpus, AMG1608, to demonstrate the effectiveness of AEG and to showcase how evaluations are conducted in research on emotion-based MIR. Directions for future work are also discussed.
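The core modeling idea in the abstract can be illustrated with a toy sketch (a deliberate simplification, not the authors' implementation): the VA annotations that multiple subjects assign to a clip are modeled as a 2-D Gaussian, and a corpus then becomes a Gaussian mixture over the valence–arousal plane, which yields soft, probabilistic emotion assignments rather than single points. All data, cluster locations, and the equal mixture weights below are hypothetical.

```python
# Toy sketch of the AEG idea: per-clip Gaussians over VA annotations,
# combined into a mixture over the valence-arousal plane. Synthetic data.
import numpy as np

rng = np.random.default_rng(0)

# Simulated annotations for two clips: subjects agree on the general mood
# but disagree in the details (the subjectivity the model captures).
happy = rng.normal([0.6, 0.5], 0.1, size=(200, 2))    # +valence, +arousal
sad = rng.normal([-0.5, -0.4], 0.1, size=(200, 2))    # -valence, -arousal

def fit_gaussian(x):
    """Per-clip Gaussian: mean and covariance of its VA annotations."""
    return x.mean(axis=0), np.cov(x, rowvar=False)

mu_h, cov_h = fit_gaussian(happy)
mu_s, cov_s = fit_gaussian(sad)

def gauss_pdf(x, mu, cov):
    """Density of a 2-D Gaussian at point x."""
    d = x - mu
    return np.exp(-0.5 * d @ np.linalg.inv(cov) @ d) / (
        2 * np.pi * np.sqrt(np.linalg.det(cov)))

def posterior_happy(x):
    """Posterior of the 'happy' component in an equal-weight 2-GMM."""
    ph = gauss_pdf(x, mu_h, cov_h)
    ps = gauss_pdf(x, mu_s, cov_s)
    return ph / (ph + ps)

# A VA point near (0.55, 0.45) is assigned to the happy component with
# near-certainty; points between the clusters would remain ambiguous.
p = posterior_happy(np.array([0.55, 0.45]))
print(round(p, 3))
```

The probabilistic output is what distinguishes this family of models from point-estimate regression: a listener's rating is explained as a draw from a distribution, so inter-subject disagreement is part of the model rather than noise to be averaged away.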
References
Barthet, M., Fazekas, G., Sandler, M.: Multidisciplinary perspectives on music emotion recognition: implications for content and context-based models. In: Proceedings International Symposium Computer Music Modeling and Retrieval, pp. 492–507 (2012)
Bigand, E., Vieillard, S., Madurell, F., Marozeau, J., Dacquet, A.: Multidimensional scaling of emotional responses to music: the effect of musical expertise and of the duration of the excerpts. Cogn. Emot. 19(8), 1113–1139 (2005)
Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, New York (2006)
Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
Bottou, L.: Online algorithms and stochastic approximations. In: Saad, D. (ed.) Online Learning and Neural Networks. Cambridge University Press, Cambridge (1998)
Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2(3), 27:1–27:39 (2011)
Chen, Y.A., Wang, J.C., Yang, Y.H., Chen, H.H.: Linear regression-based adaptation of music emotion recognition models for personalization. In: Proceedings IEEE International Conference Acoustics, Speech, and Signal Processing, pp. 2149–2153 (2014)
Chen, Y.A., Yang, Y.H., Wang, J.C., Chen, H.H.: The AMG1608 dataset for music emotion recognition. In: Proceedings IEEE International Conference Acoustics, Speech, and Signal Processing (2015). http://mpac.ee.ntu.edu.tw/dataset/AMG1608/
Chou, W.: Minimum classification error approach in pattern recognition. In: Chou, W., Juang, B.H. (eds.) Pattern Recognition in Speech and Language Processing. CRC Press, New York (2003)
Collier, G.: Beyond valence and activity in the emotional connotations of music. Psychol. Music 35(1), 110–131 (2007)
Davis, J.V., Dhillon, I.S.: Differential entropic clustering of multivariate Gaussians. Adv. Neural Inf. Process. Syst. 19, 337–344 (2007)
Davis, S., Mermelstein, P.: Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE Trans. Acoust. Speech Signal Process. 28(4), 357–366 (1980)
Eerola, T.: Modelling emotions in music: advances in conceptual, contextual and validity issues. In: Proceedings AES International Conference (2014)
Eerola, T., Vuoskoski, J.K.: A comparison of the discrete and dimensional models of emotion in music. Psychol. Music 39, 18–49 (2010)
Gabrielsson, A.: Emotion perceived and emotion felt: same or different? Musicae Scientiae, pp. 123–147 (2002)
Gauvain, J., Lee, C.H.: Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains. IEEE Trans. Speech Audio Process. 2, 291–298 (1994)
Gillet, O., Richard, G.: Automatic transcription of drum loops. In: Proceedings IEEE International Conference Acoustics, Speech, and Signal Processing, pp. 269–272 (2004)
Hallam, S., Cross, I., Thaut, M.: The Oxford Handbook of Music Psychology. Oxford University Press, Oxford (2008)
Hevner, K.: Expression in music: a discussion of experimental studies and theories. Psychol. Rev. 48(2), 186–204 (1935)
Hoffman, M., Blei, D., Cook, P.: Easy as CBA: a simple probabilistic model for tagging music. In: Proceedings International Society Music Information Retrieval Conference, pp. 369–374 (2009)
Hofmann, T.: Probabilistic latent semantic indexing. In: Proceedings ACM SIGIR Conference Research and Development in Information Retrieval, pp. 50–57 (1999)
Hu, X., Downie, J.S.: When lyrics outperform audio for music mood classification: a feature analysis. In: Proceedings International Society Music Information Retrieval Conference, pp. 619–624 (2010)
Hu, X., Yang, Y.H.: A study on cross-cultural and cross-dataset generalizability of music mood regression models. In: Proceedings Sound and Music Computing Conference (2014)
Hu, X., Downie, J.S., Laurier, C., Bay, M., Ehmann, A.F.: The 2007 MIREX audio mood classification task: Lessons learned. In: Proceedings International Society Music Information Retrieval Conference, pp. 462–467 (2008)
Huq, A., Bello, J.P., Rowe, R.: Automated music emotion recognition: a systematic evaluation. J. New Music Res. 39(3), 227–244 (2010)
Huron, D.: Sweet Anticipation: Music and the Psychology of Expectation. MIT Press, Cambridge (2006)
Imbrasaite, V., Baltrusaitis, T., Robinson, P.: Emotion tracking in music using continuous conditional random fields and relative feature representation. In: Proceedings International Workshop Affective Analysis in Multimedia (2013)
Järvelin, K., Kekäläinen, J.: Cumulated gain-based evaluation of IR techniques. ACM Trans. Inf. Syst. 20(4), 422–446 (2002)
Juang, B.H., Chou, W., Lee, C.H.: Minimum classification error rate methods for speech recognition. IEEE Trans. Speech Audio Process. 5(3), 257–265 (1997)
Juslin, P.N.: Cue utilization in communication of emotion in music performance: relating performance to perception. J. Exp. Psychol. Hum. Percept. Perform. 16(6), 1797–1813 (2000)
Juslin, P., Laukka, P.: Expression, perception, and induction of musical emotions: a review and a questionnaire study of everyday listening. J. New Music Res. 33(3), 217–238 (2004)
Juslin, P.N., Sloboda, J.A.: Music and Emotion: Theory and Research. Oxford University Press, New York (2001)
Kim, Y.E., Schmidt, E.M., Migneco, R., Morton, B.G., Richardson, P., Scott, J.J., Speck, J.A., Turnbull, D.: Music emotion recognition: A state of the art review. In: Proceedings International Society Music Information Retrieval Conference, pp. 255–266 (2010)
Korhonen, M.D., Clausi, D.A., Jernigan, M.E.: Modeling emotional content of music using system identification. IEEE Trans. Syst. Man Cybern. 36(3), 588–599 (2006)
Kullback, S., Leibler, R.A.: On information and sufficiency. Ann. Math. Stat. 22(1), 79–86 (1951)
Lartillot, O., Toiviainen, P.: A Matlab toolbox for musical feature extraction from audio. In: Proceedings International Conference Digital Audio Effects, pp. 237–244 (2007)
Lonsdale, A.J., North, A.C.: Why do we listen to music? A uses and gratifications analysis. Br. J. Psychol. 102, 108–134 (2011)
Lu, L., Liu, D., Zhang, H.: Automatic mood detection and tracking of music audio signals. IEEE Trans. Audio Speech Lang. Process. 14(1), 5–18 (2006)
MacDorman, K.F., Ough, S., Ho, C.C.: Automatic emotion prediction of song excerpts: index construction, algorithm design, and empirical comparison. J. New Music Res. 36(4), 281–299 (2007)
Madsen, J., Jensen, B.S., Larsen, J.: Modeling temporal structure in music for emotion prediction using pairwise comparisons. In: Proceedings International Society Music Information Retrieval Conference, pp. 319–324 (2014)
Makhoul, J.: Linear prediction: a tutorial review. Proc. IEEE 63(4), 561–580 (1975)
Mathieu, B., Essid, S., Fillon, T., Prado, J., Richard, G.: YAAFE, an easy to use and efficient audio feature extraction software. In: Proceedings International Society Music Information Retrieval Conference, pp. 441–446 (2010)
Panda, R., Rocha, B., Paiva, R.P.: Dimensional music emotion recognition: Combining standard and melodic audio features. In: Proceedings International Symposium Computer Music Modeling and Retrieval (2013)
Paolacci, G., Chandler, J., Ipeirotis, P.: Running experiments on Amazon Mechanical Turk. Judgm. Decis. Making 5(5), 411–419 (2010)
Peeters, G.: A large set of audio features for sound description (similarity and classification) in the CUIDADO project. Technical report, IRCAM, Paris, France (2004)
Pesek, M., et al.: Gathering a dataset of multi-modal mood-dependent perceptual responses to music. In: Proceedings the EMPIRE Workshop (2014)
Raykar, V.C., Yu, S., Zhao, L.H., Valadez, G.H., Florin, C., Bogoni, L., Moy, L.: Learning from crowds. J. Mach. Learn. Res. 11, 1297–1322 (2010)
Reynolds, D.A., Quatieri, T.F., Dunn, R.B.: Speaker verification using adapted Gaussian mixture models. Digital Signal Process. 10(1–3), 19–41 (2000)
Russell, J.A.: A circumplex model of affect. J. Pers. Soc. Psychol. 39(6), 1161–1178 (1980)
Saari, P., Eerola, T.: Semantic computing of moods based on tags in social media of music. IEEE Trans. Knowl. Data Eng. 26(10), 2548–2560 (2014)
Saari, P., Eerola, T., Fazekas, G., Barthet, M., Lartillot, O., Sandler, M.: The role of audio and tags in music mood prediction: a study using semantic layer projection. In: Proceedings International Society Music Information Retrieval Conference, pp. 201–206 (2013)
Schmidt, E.M., Kim, Y.E.: Prediction of time-varying musical mood distributions from audio. In: Proceedings International Society Music Information Retrieval Conference, pp. 465–470 (2010)
Schmidt, E.M., Kim, Y.E.: Modeling musical emotion dynamics with conditional random fields. In: Proceedings International Society Music Information Retrieval Conference, pp. 777–782 (2011)
Schmidt, E.M., Kim, Y.E.: Learning rhythm and melody features with deep belief networks. In: Proceedings International Society Music Information Retrieval Conference, pp. 21–26 (2013)
Schölkopf, B., Smola, A.J., Williamson, R.C., Bartlett, P.L.: New support vector algorithms. Neural Comput. 12, 1207–1245 (2000)
Schubert, E.: Modeling perceived emotion with continuous musical features. Music Percept. 21(4), 561–585 (2004)
Schuller, B., Hage, C., Schuller, D., Rigoll, G.: ‘Mister D.J., Cheer Me Up!’: musical and textual features for automatic mood classification. J. New Music Res. 39(1), 13–34 (2010)
Sen, A., Srivastava, M.S.: Regression Analysis: Theory, Methods, and Applications. Springer Science & Business Media (1990)
Soleymani, M., Caro, M.N., Schmidt, E., Sha, C.Y., Yang, Y.H.: 1000 songs for emotional analysis of music. In: Proceedings International Workshop Crowdsourcing for Multimedia, pp. 1–6 (2013)
Soleymani, M., Aljanaki, A., Yang, Y.H., Caro, M.N., Eyben, F., Markov, K., Schuller, B., Veltkamp, R., Weninger, F., Wiering, F.: Emotional analysis of music: a comparison of methods. In: Proceedings ACM Multimedia, pp. 1161–1164 (2014)
Su, L., Yeh, C.C.M., Liu, J.Y., Wang, J.C., Yang, Y.H.: A systematic evaluation of the bag-of-frames representation for music information retrieval. IEEE Trans. Multimedia 16(5), 1188–1200 (2014)
Wang, M.Y., Zhang, N.Y., Zhu, H.C.: User-adaptive music emotion recognition. In: Proceedings IEEE International Conference Signal Processing, pp. 1352–1355 (2004)
Wang, J.C., Lee, H.S., Wang, H.M., Jeng, S.K.: Learning the similarity of audio music in bag-of-frames representation from tagged music data. In: Proceedings International Society Music Information Retrieval Conference, pp. 85–90 (2011)
Wang, J.C., Wang, H.M., Jeng, S.K.: Playing with tagging: a real-time tagging music player. In: Proceedings IEEE International Conference Acoustics, Speech, and Signal Processing, pp. 77–80 (2012)
Wang, J.C., Yang, Y.H., Chang, K., Wang, H.M., Jeng, S.K.: Exploring the relationship between categorical and dimensional emotion semantics of music. In: Proceedings ACM International Workshop Music Information Retrieval with User-Centered and Multimodal Strategies, pp. 63–68 (2012)
Wang, J.C., Yang, Y.H., Jhuo, I., Lin, Y.Y., Wang, H.M.: The acoustic-visual emotion Gaussians model for automatic generation of music video. In: Proceedings ACM Multimedia, pp. 1379–1380 (2012)
Wang, J.C., Yang, Y.H., Wang, H.M., Jeng, S.K.: The acoustic emotion Gaussians model for emotion-based music annotation and retrieval. In: Proceedings ACM Multimedia, pp. 89–98 (2012)
Wang, J.C., Yang, Y.H., Wang, H.M., Jeng, S.K.: Personalized music emotion recognition via model adaptation. In: Proceedings APSIPA Annual Summit & Conference (2012)
Wang, X., Wu, Y., Chen, X., Yang, D.: A two-layer model for music pleasure regression. In: Proceedings International Workshop Affective Analysis in Multimedia (2013)
Wang, S.Y., Wang, J.C., Yang, Y.H., Wang, H.M.: Towards time-varying music auto-tagging based on CAL500 expansion. In: Proceedings IEEE International Conference Multimedia and Expo, pp. 1–6 (2014)
Wang, J.C., Wang, H.M., Lanckriet, G.: A histogram density modeling approach to music emotion recognition. In: Proceedings IEEE International Conference Acoustics, Speech, and Signal Processing, pp. 698–702 (2015)
Wang, J.C., Yang, Y.H., Wang, H.M., Jeng, S.K.: Modeling the affective content of music with a Gaussian mixture model. IEEE Trans. Affect. Comput. 6(1), 56–68 (2015)
Weninger, F., Eyben, F., Schuller, B.: On-line continuous-time music mood regression with deep recurrent neural networks. In: Proceedings IEEE International Conference Acoustics, Speech, and Signal Processing, pp. 5449–5453 (2014)
Yang, Y.H., Chen, H.H.: Music Emotion Recognition. CRC Press, Boca Raton (2011)
Yang, Y.H., Chen, H.H.: Predicting the distribution of perceived emotions of a music signal for content retrieval. IEEE Trans. Audio Speech Lang. Process. 19(7), 2184–2196 (2011)
Yang, Y.H., Chen, H.H.: Ranking-based emotion recognition for music organization and retrieval. IEEE Trans. Audio Speech Lang. Process. 19(4), 762–774 (2011)
Yang, Y.H., Chen, H.H.: Machine recognition of music emotion: a review. ACM Trans. Intell. Syst. Technol. 3(4) (2012)
Yang, Y.H., Liu, J.Y.: Quantitative study of music listening behavior in a social and affective context. IEEE Trans. Multimedia 15(6), 1304–1315 (2013)
Yang, Y.H., Su, Y.F., Lin, Y.C., Chen, H.H.: Music emotion recognition: The role of individuality. In: Proceedings ACM International Workshop Human-Centered Multimedia, pp. 13–21 (2007)
Yang, Y.H., Lin, Y.C., Cheng, H.T., Chen, H.H.: Mr. Emo: Music retrieval in the emotion plane. In: Proceedings ACM Multimedia, pp. 1003–1004 (2008)
Yang, Y.H., Lin, Y.C., Su, Y.F., Chen, H.H.: A regression approach to music emotion recognition. IEEE Trans. Audio Speech Lang. Process. 16(2), 448–457 (2008)
Yang, Y.H., Lin, Y.C., Chen, H.H.: Personalized music emotion recognition. In: Proceedings ACM SIGIR International Conference Research and Development in Information Retrieval, pp. 748–749 (2009)
Yang, Y.H., Wang, J.C., Chen, Y.A., Chen, H.H.: Model adaptation for personalized music emotion recognition. In: Chen, C.H. (ed.) Handbook of Pattern Recognition and Computer Vision, 5th Edition, World Scientific Publishing Co., Singapore (2015)
Yeh, C.C., Tseng, S.S., Tsai, P.C., Weng, J.F.: Building a personalized music emotion prediction system. In: Advances in Multimedia Information Processing-PCM 2006, pp. 730–739. Springer (2006)
Zentner, M., Grandjean, D., Scherer, K.R.: Emotions evoked by the sound of music: characterization, classification, and measurement. Emotion 8(4), 494 (2008)
Zhu, B., Liu, T.: Research on emotional vocabulary-driven personalized music retrieval. In: Edutainment, pp. 252–261 (2008)
Copyright information
© 2016 Springer International Publishing Switzerland
Cite this chapter
Wang, JC., Yang, YH., Wang, HM. (2016). Affective Music Information Retrieval. In: Tkalčič, M., De Carolis, B., de Gemmis, M., Odić, A., Košir, A. (eds) Emotions and Personality in Personalized Services. Human–Computer Interaction Series. Springer, Cham. https://doi.org/10.1007/978-3-319-31413-6_12
DOI: https://doi.org/10.1007/978-3-319-31413-6_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-31411-2
Online ISBN: 978-3-319-31413-6
eBook Packages: Computer Science (R0)