Distributed Vector Representations of Folksong Motifs

  • Aitor Arronte Alvarez
  • Francisco Gómez-Martin
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11502)


This article presents a distributed vector representation model for learning embeddings of folksong motifs. A skip-gram version of word2vec with negative sampling is used to learn high-quality motif embeddings. Motifs from the Essen Folksong Collection are compared by their cosine similarity. A new evaluation method, based on a melodic similarity task, is presented for testing the quality of the embeddings. It shows how the vector space can represent complex contextual features and how it can be used to study folksong variation.
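The two ingredients named above, skip-gram context pairs and cosine similarity, can be illustrated with a minimal sketch. It is not the authors' implementation: the motif token format (interval triples such as `+2_+2_-1`), the window size, and the toy vectors are assumptions made only for illustration, and a trained word2vec model would supply the real embeddings.

```python
import math


def skipgram_pairs(motifs, window=2):
    """Return (target, context) pairs for motifs within `window` positions."""
    pairs = []
    for i, target in enumerate(motifs):
        lo = max(0, i - window)
        hi = min(len(motifs), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a motif is not its own context
                pairs.append((target, motifs[j]))
    return pairs


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical melody encoded as a sequence of motif tokens (assumed format).
melody = ["+2_+2_-1", "+2_-1_-2", "-1_-2_+2", "-2_+2_+2"]
pairs = skipgram_pairs(melody, window=1)

# Toy 3-dimensional vectors standing in for learned motif embeddings.
toy_vectors = {
    "+2_+2_-1": [0.9, 0.1, 0.3],
    "+2_-1_-2": [0.8, 0.2, 0.4],
    "-1_-2_+2": [-0.5, 0.7, 0.1],
}
similarity = cosine_similarity(toy_vectors["+2_+2_-1"], toy_vectors["+2_-1_-2"])
print(len(pairs), round(similarity, 3))
```

In a skip-gram model each pair is a training example: the target motif is used to predict its context motif, and negative sampling contrasts it with randomly drawn motifs. After training, motifs that occur in similar melodic contexts should score close to 1 under `cosine_similarity`.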


Folksong motifs · Melodic context · Motif embedding · Word2vec



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Aitor Arronte Alvarez (1)
  • Francisco Gómez-Martin (2)
  1. Center for Language and Technology, University of Hawaii at Manoa, Honolulu, USA
  2. Applied Mathematics Department, Technical University of Madrid, Madrid, Spain
