Related Work and a Taxonomy of Musical Intelligence Tasks

  • Elad Liebman
Part of the Studies in Computational Intelligence book series (SCI, volume 857)


To better understand the contributions of this book, it is important to place them in context. Given the breadth and complexity of research at the intersection of AI and music informatics, there is a need for useful ways of breaking down and organizing previous work. In this chapter I therefore not only provide a broad overview of the related literature, but also propose a unified framework for parsing this varied and complex body of work, delineating a taxonomy of music AI tasks and mapping out the overall state of the art. In doing so, this chapter constitutes Contribution 1 of this book.


  1. 1.
    M. Duckham, L. Kulik, “Simplest” paths: automated route selection for navigation, in International Conference on Spatial Information Theory (Springer, Berlin, 2003), pp 169–185Google Scholar
  2. 2.
    M. Wolterman, Infrastructure-based collision warning using artificial intelligence, US Patent 7,317,406, 8 Jan 2008Google Scholar
  3. 3.
    G. Adomavicius, A. Tuzhilin, Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 17(6), 734–749 (2005)CrossRefGoogle Scholar
  4. 4.
    M. Zhao, S.-C. Zhu, Sisley the abstract painter, in Proceedings of the 8th International Symposium on Non-Photorealistic Animation and Rendering (ACM, New York, 2010), pp. 99–107Google Scholar
  5. 5.
    C. Doersch, S. Singh, A. Gupta, J. Sivic, A.A. Efros, What makes paris look like paris? ACM Trans. Graph. (TOG) 31(4), 101 (2012)CrossRefGoogle Scholar
  6. 6.
    T. Cour, B. Sapp, C. Jordan, B. Taskar. Learning from ambiguously labeled images, in IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2009 (IEEE, 2009), pp. 919–926Google Scholar
  7. 7.
    S. Argamon, M. Koppel, J. Fine, A.R. Shimoni, Gender, genre, and writing style in formal written texts. Text- Hague Then Amst Then Berl - 23(3), 321–346 (2003)Google Scholar
  8. 8.
    E. Stamatatos, A survey of modern authorship attribution methods. J. Assoc. Inf. Sci. Technol. 60(3), 538–556 (2009)CrossRefGoogle Scholar
  9. 9.
    M.G. Kirschenbaum, The remaking of reading: Data mining and the digital humanities, in The National Science Foundation Symposium on Next Generation of Data Mining and Cyber-Enabled Discovery for Innovation, Baltimore, MD (2007)Google Scholar
  10. 10.
    L.A. Hiller, L.M. Isaacson, Experimental Music: Composition with an Electronic Computer (Greenwood Publishing Group Inc., Westport, 1959)Google Scholar
  11. 11.
    I. Xenakis, Formalized Music: Thought and Mathematics in Composition, vol. 6 (Pendragon Press, Hillsdale, 1992)Google Scholar
  12. 12.
    I. Xenakis, Free stochastic music from the computer. Programme of stochastic music in fortran. Gravesaner Blätter, 26, 54–92 (1965)Google Scholar
  13. 13.
    G. Born, Rationalizing Culture: IRCAM, Boulez, and the Institutionalization of the Musical Avant-garde (University of California Press, Berkeley, 1995)Google Scholar
  14. 14.
    J. Anderson, A provisional history of spectral music. Contemp. Music. Rev. 19(2), 7–22 (2000)CrossRefGoogle Scholar
  15. 15.
    R.S. Jackendoff, Semantic Interpretation in Generative Grammar (The MIT Press, Cambridge, 1972)Google Scholar
  16. 16.
    E. Muñoz, J.M. Cadenas, Y.S. Ong, G. Acampora, Memetic music composition. IEEE Trans. Evol. Comput. 20(1), 1–15 (2016)CrossRefGoogle Scholar
  17. 17.
    D. Quick, Generating music using concepts from schenkerian analysis and chord spaces. Technical report, Yale University, 2010Google Scholar
  18. 18.
    S. Doraisamy, S. Golzari, N. Mohd, M.N. Sulaiman, N.I. Udzir, A study on feature selection and classification techniques for automatic genre classification of traditional malay music, in ISMIR (2008), pp. 331–336Google Scholar
  19. 19.
    A. Mardirossian, E. Chew, Music summarization via key distributions: analyses of similarity assessment across variations, in ISMIR (2006), pp. 234–239Google Scholar
  20. 20.
    B. Eric, N. De Freitas, “Name that song!” a probabilistic approach to querying on music and text, in Advances in Neural Information Processing Systems (2003), pp. 1529–1536Google Scholar
  21. 21.
    M. Pearce, D. Müllensiefen, G.A. Wiggins, A comparison of statistical and rule-based models of melodic segmentation, in ISMIR (2008), pp. 89–94Google Scholar
  22. 22.
    R. Chen, M. Li, Music structural segmentation by combining harmonic and timbral information, in ISMIR (2011), pp. 477–482Google Scholar
  23. 23.
    E. Liebman, E. Ornoy, B. Chor, A phylogenetic approach to music performance analysis. J. New Music. Res. 41(2), 195–222 (2012)CrossRefGoogle Scholar
  24. 24.
    D. Conklin, I.H. Witten, Multiple viewpoint systems for music prediction. J. New Music. Res. 24(1), 51–73 (1995)CrossRefGoogle Scholar
  25. 25.
    C.L. Krumhansl, Cognitive Foundations of Musical Pitch (Oxford University Press, Oxford, 2001)Google Scholar
  26. 26.
    S. Abdallah, M. Plumbley, Information dynamics: patterns of expectation and surprise in the perception of music. Connect. Sci. 21(2–3), 89–117 (2009)CrossRefGoogle Scholar
  27. 27.
    P.N. Juslin, D. Västfjäll, Emotional responses to music: the need to consider underlying mechanisms. Behav. Brain Sci. 31(5), 559–575 (2008)CrossRefGoogle Scholar
  28. 28.
    K. Dautenhahn, Getting to know each other-artificial social intelligence for autonomous robots. Robot. Auton. Syst. 16(2–4), 333–356 (1995)CrossRefGoogle Scholar
  29. 29.
    L.-J. Li, R. Socher, L. Fei-Fei, Towards total scene understanding: classification, annotation and segmentation in an automatic framework, in IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2009 (IEEE, 2009), pp. 2036–2043Google Scholar
  30. 30.
    S. Russell, P. Norvig, Artificial Intelligence. A Modern Approach, vol. 25 (Prentice-Hall, Englewood Cliffs, 1995), p. 27Google Scholar
  31. 31.
    A. Latham, The Oxford dictionary of musical terms (Oxford University Press, Oxford. 2004)Google Scholar
  32. 32.
    G. Loy, Musicians make a standard: the midi phenomenon. Comput. Music. J. 9(4), 8–26 (1985)CrossRefGoogle Scholar
  33. 33.
    M.A. Hearst, S.T. Dumais, E. Osuna, J. Platt, B. Scholkopf, Support vector machines. IEEE Intell. Syst. Appl. 13(4), 18–28 (1998)CrossRefGoogle Scholar
  34. 34.
    L.R. Rabiner, A tutorial on hidden markov models and selected applications in speech recognition. Proc. IEEE 77(2), 257–286 (1989)CrossRefGoogle Scholar
  35. 35.
    M. Richardson, P. Domingos, Markov logic networks. Mach. Learn. 62(1–2), 107–136 (2006)CrossRefGoogle Scholar
  36. 36.
    J. Lafferty, A. McCallum, F.C. Pereira, Conditional random fields: probabilistic models for segmenting and labeling sequence data (2001)Google Scholar
  37. 37.
    D.M. Blei, A.Y. Ng, M.I. Jordan, Latent dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)zbMATHGoogle Scholar
  38. 38.
    Y. LeCun, Y. Bengio et al., Convolutional networks for images, speech, and time series. Handb. Brain Theory Neural Netw. 3361(10), 1995 (1995)Google Scholar
  39. 39.
    K. Gurney, An Introduction to Neural Networks (CRC Press, London, 1997)Google Scholar
  40. 40.
    S. Hochreiter, J. Schmidhuber, Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)CrossRefGoogle Scholar
  41. 41.
    R. Davis, B. Buchanan, E. Shortliffe, Production rules as a representation for a knowledge-based consultation program. Artif. Intell. 8(1), 15–45 (1977)zbMATHCrossRefGoogle Scholar
  42. 42.
    J. Von Neumann, Probabilistic logic. California Institute Technology, 1952Google Scholar
  43. 43.
    L.A. Zadeh, Fuzzy logic and approximate reasoning. Synthese 30(3–4), 407–428 (1975)zbMATHCrossRefGoogle Scholar
  44. 44.
    B. Thom, Bob: an interactive improvisational music companion, in Proceedings of the Fourth International Conference on Autonomous Agents(ACM, New York, 2000), pp. 309–316Google Scholar
  45. 45.
    G. Hoffman, G. Weinberg, Interactive improvisation with a robotic marimba player. Auton. Robot. 31(2–3), 133–153 (2011)CrossRefGoogle Scholar
  46. 46.
    T. Blackwell, Swarm music: improvised music with multi-swarms, in Artificial Intelligence and the Simulation of Behaviour (University of Wales, 2003)Google Scholar
  47. 47.
    A. Cont, S. Dubnov, G. Assayag, Anticipatory model of musical style imitation using collaborative and competitive reinforcement learning, in Workshop on Anticipatory Behavior in Adaptive Learning Systems (Springer, Berlin, 2006), pp. 285–306Google Scholar
  48. 48.
    P. van Kranenburg, Assessing disputed attributions for organ fugues in the js bach (bwv) catalogue. Comput. Musicol. 15, 120–137 (2008)Google Scholar
  49. 49.
    A.M. Owen, The authorship of bach’s cantata no. 15. Music. Lett. 41(1), 28–32 (1960)Google Scholar
  50. 50.
    E. Scheirer, M. Slaney, Construction and evaluation of a robust multifeature speech/music discriminator, in 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing. ICASSP-97, vol. 2 (IEEE, 1997), pp. 1331–1334Google Scholar
  51. 51.
    J. Marques, P.J. Moreno, A study of musical instrument classification using gaussian mixture models and support vector machines. Cambridge Research Laboratory Technical Report Series CRL, 4 June 1999Google Scholar
  52. 52.
    P. Herrera, X. Amatriain, E. Batlle, X. Serra, Towards instrument segmentation for music content description: a critical review of instrument classification techniques, in International Symposium on Music Information Retrieval ISMIR, vol. 290 (2000)Google Scholar
  53. 53.
    K.D. Martin, Y.E. Kim, Musical instrument identification: a pattern-recognition approach (1998)Google Scholar
  54. 54.
    K.D. Martin, Sound-source recognition: a theory and computational model. Ph.D. thesis, Massachusetts Institute of Technology, 1999Google Scholar
  55. 55.
    J. Marques, An automatic annotation system for audio data containing music. Ph.D. thesis, Massachusetts Institute of Technology, 1999Google Scholar
  56. 56.
    M. Eichner, M. Wolff, R. Hoffmann, Instrument classification using hidden markov models. System 1(2), 3 (2006)Google Scholar
  57. 57.
    E. Benetos, M. Kotti, C. Kotropoulos, Musical instrument classification using non-negative matrix factorization algorithms and subset feature selection, in 2006 IEEE International Conference on Acoustics, Speech and Signal Processing. ICASSP 2006 Proceedings, vol. 5 (IEEE, 2006), pp. V–VGoogle Scholar
  58. 58.
    C. Joder, S. Essid, G. Richard, Temporal integration for audio classification with application to musical instrument classification. IEEE Trans. Audio Speech Lang. Process. 17(1), 174–186 (2009)CrossRefGoogle Scholar
  59. 59.
    A. Meng, P. Ahrendt, J. Larsen, L.K. Hansen, Temporal feature integration for music genre classification. IEEE Trans. Audio Speech Lang. Process. 15(5), 1654–1664 (2007)CrossRefGoogle Scholar
  60. 60.
    S. Garcıa-Dıez, M. Saerens, M. Senelle, F. Fouss, A simple-cycles weighted kernel based on harmony structure for similarity retrieval, in Proceedings of the 12th International Conference on Music Information Retrieval (ISMIR)(2011), pp. 61–66Google Scholar
  61. 61.
    D. Fourer, J.-L. Rouas, P. Hanna, M. Robine, Automatic timbre classification of ethnomusicological audio recordings, in International Society for Music Information Retrieval Conference (ISMIR 2014) (2014)Google Scholar
  62. 62.
    S. Mika, G. Ratsch, J. Weston, B. Scholkopf, K.-R. Mullers, Fisher discriminant analysis with kernels, in Proceedings of the 1999 IEEE Signal Processing Society Workshop on Neural Networks for Signal Processing, vol. IX (IEEE, 1999), pp. 41–48Google Scholar
  63. 63.
    T. George, E. Georg, C. Perry, Automatic musical genre classification of audio signals, in Proceedings of the 2nd International Symposium on Music Information Retrieval, Indiana (2001)Google Scholar
  64. 64.
    S. Dubnov, G. Assayag, O. Lartillot, G. Bejerano, Using machine-learning methods for musical style modeling. Computer 36(10), 73–80 (2003)CrossRefGoogle Scholar
  65. 65.
    T. Li, M. Ogihara, Q. Li, A comparative study on content-based music genre classification, in Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Informaion Retrieval(ACM, New York, 2003), pp. 282–289Google Scholar
  66. 66.
    Y. Panagakis, C. Kotropoulos, G.R. Arce, Non-negative multilinear principal component analysis of auditory temporal modulations for music genre classification. IEEE Trans. Audio Speech Lang. Process. 18(3), 576–588 (2010)CrossRefGoogle Scholar
  67. 67.
    J. Salamon, B. Rocha, E. Gómez, Musical genre classification using melody features extracted from polyphonic music signals, in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)(IEEE, 2012), pp. 81–84Google Scholar
  68. 68.
    Y. Anan, K. Hatano, H. Bannai, M. Takeda, K. Satoh, Polyphonic music classification on symbolic data using dissimilarity functions, in ISMIR (2012), pp. 229–234Google Scholar
  69. 69.
    C.M. Marques, I.R. Guilherme, R.Y. Nakamura, J.P. Papa, New trends in musical genre classification using optimum-path forest, in ISMIR (2011), pp. 699–704Google Scholar
  70. 70.
    H. Rump, S. Miyabe, E. Tsunoo, N. Ono, S. Sagayama, Autoregressive mfcc models for genre classification improved by harmonic-percussion separation, in ISMIR (Citeseer, 2010), pp. 87–92Google Scholar
  71. 71.
    Y. Panagakis, C. Kotropoulos, G.R. Arce, Sparse multi-label linear embedding within nonnegative tensor factorization applied to music tagging, in ISMIR (2010), pp. 393–398Google Scholar
  72. 72.
    K. West, S. Cox. Features and classifiers for the automatic classification of musical audio signals, in ISMIR (2004)Google Scholar
  73. 73.
    T. Arjannikov, J.Z. Zhang, An association-based approach to genre classification in music, in ISMIR (2014), pp. 95–100Google Scholar
  74. 74.
    R. Hillewaere, B. Manderick, D. Conklin, String methods for folk tune genre classification, in ISMIR, vol. 2012 (2012), p. 13Google Scholar
  75. 75.
    R. Mayer, A. Rauber. Musical genre classification by ensembles of audio and lyrics features, in Proceedings of International Conference on Music Information Retrieval (2011), pp. 675–680Google Scholar
  76. 76.
    W. Herlands, R. Der, Y. Greenberg, S. Levin, A machine learning approach to musically meaningful homogeneous style classification, in Twenty-Eighth AAAI Conference on Artificial Intelligence (2014)Google Scholar
  77. 77.
    P. Hamel, M.E. Davies, K. Yoshii, M. Goto, Transfer learning in mir: sharing learned latent representations for music audio classification and similarity, in ISMIR (2013), pp. 9–14Google Scholar
  78. 78.
    A. Tellegen, D. Watson, L.A. Clark, On the dimensional and hierarchical structure of affect. Psychol. Sci. 10(4), 297–303 (1999)CrossRefGoogle Scholar
  79. 79.
    D. Yang, W.-S. Lee, Disambiguating music emotion using software agents. ISMIR 4, 218–223 (2004)Google Scholar
  80. 80.
    R.E. Thayer, The Biopsychology of Mood and Arousal (Oxford University Press, Oxford, 1990)Google Scholar
  81. 81.
    B.-j. Han, S. Ho, R.B. Dannenberg, E. Hwang, Smers: music emotion recognition using support vector regression (2009)Google Scholar
  82. 82.
    K. Trohidis, G. Tsoumakas, G. Kalliris, I.P. Vlahavas, Multi-label classification of music into emotions. ISMIR 8, 325–330 (2008)Google Scholar
  83. 83.
    Q. Lu, X. Chen, D. Yang, J. Wang, Boosting for multi-modal music emotion, in 11th International Society for Music Information and Retrieval Conference (2010), p. 105Google Scholar
  84. 84.
    M. Mann, T.J. Cox, F.F. Li, Music mood classification of television theme tunes, in ISMIR (2011), pp. 735–740Google Scholar
  85. 85.
    Y. Song, S. Dixon, M. Pearce, Evaluation of musical features for emotion classification, in ISMIR (2012), pp. 523–528Google Scholar
  86. 86.
    L. Su, L.-F. Yu, Y.-H. Yang, Sparse cepstral, phase codes for guitar playing technique classification, in ISMIR (2014), pp. 9–14Google Scholar
  87. 87.
    P. Toiviainen, T. Eerola, Classification of musical metre with autocorrelation and discriminant functions, in ISMIR (2005), pp. 351–357Google Scholar
  88. 88.
    M. Lagrange, A. Ozerov, E. Vincent, Robust singer identification in polyphonic music using melody enhancement and uncertainty-based learning, in 13th International Society for Music Information Retrieval Conference (ISMIR) (2012)Google Scholar
  89. 89.
    S. Abdoli, Iranian traditional music dastgah classification, in ISMIR (2011), pp. 275–280Google Scholar
  90. 90.
    K. Yoshii, M. Goto, K. Komatani, T. Ogata, H.G. Okuno, Hybrid collaborative and content-based music recommendation using probabilistic model with latent user preferences, in ISMIR (2006), vol. 6, p. 7Google Scholar
  91. 91.
    K. Yoshii, M. Goto, K. Komatani, T. Ogata, H.G. Okuno, Improving efficiency and scalability of model-based music recommender system based on incremental training, in ISMIR (2007), pp. 89–94Google Scholar
  92. 92.
    M. Tiemann, S. Pauws, F. Vignoli, Ensemble learning for hybrid music recommendation, in ISMIR (2007), pp. 179–180Google Scholar
  93. 93.
    D. Eck, T. Bertin-Mahieux, P. Lamere, Autotagging music using supervised machine learning, in ISMIR (2007), pp. 367–368Google Scholar
  94. 94.
    B. Horsburgh, S. Craw, S. Massie, Learning pseudo-tags to augment sparse tagging in hybrid music recommender systems. Artif. Intell. 219, 25–39 (2015)CrossRefGoogle Scholar
  95. 95.
    Y. Hu, M. Ogihara, Nextone player: a music recommendation system based on user behavior, in ISMIR (2011), pp. 103–108Google Scholar
  96. 96.
    Y. Hu, D. Li, M. Ogihara, Evaluation on feature importance for favorite song detection, in ISMIR (2013), pp. 323–328Google Scholar
  97. 97.
    Z. Xing, X. Wang, Y. Wang, Enhancing collaborative filtering music recommendation by balancing exploration and exploitation, in ISMIR (2014), pp. 445–450Google Scholar
  98. 98.
    P. Knees, M. Schedl, A survey of music similarity and recommendation from music context data. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 10(1), 2 (2013)Google Scholar
  99. 99.
    Y. Song, S. Dixon, M. Pearce, A survey of music recommendation systems and future perspectives, in 9th International Symposium on Computer Music Modeling and Retrieval (2012)Google Scholar
  100. 100.
    M. Betser, P. Collen, J.-B. Rault, Audio identification using sinusoidal modeling and application to jingle detection, in ISMIR (2007), pp. 139–142Google Scholar
  101. 101.
    M. Skalak, J. Han, B. Pardo, Speeding melody search with vantage point trees, in ISMIR (2008), pp. 95–100Google Scholar
  102. 102.
    R. Miotto, N. Orio. A music identification system based on chroma indexing and statistical modeling, in ISMIR(2008), pp. 301–306Google Scholar
  103. 103.
    J.-C. Wang, M.-C. Yen, Y.-H. Yang, H.-M. Wang, Automatic set list identification and song segmentation for full-length concert videos, in ISMIR (2014), pp. 239–244Google Scholar
  104. 104.
    P. Grosche, J. Serra, M. Müller, J.L. Arcos, Structure-based audio fingerprinting for music retrieval, in 13th International Society for Music Information Retrieval Conference (FEUP Edições, 2012), pp. 55–60Google Scholar
  105. 105.
    J. Foote, Visualizing music and audio using self-similarity, in Proceedings of the Seventh ACM International Conference on Multimedia (Part 1) (ACM, New York, 1999), pp. 77–80Google Scholar
  106. 106.
    M. Müller, F. Kurth, M. Clausen, Audio matching via chroma-based statistical features, in ISMIR, vol. 2005 (2005), p. 6Google Scholar
  107. 107.
    A. Bellet, J.F. Bernabeu, A. Habrard, M. Sebban, Learning discriminative tree edit similarities for linear classification-application to melody recognition. Neurocomputing 214, 155–161 (2016)CrossRefGoogle Scholar
  108. 108.
    J.C. Platt, Fast embedding of sparse similarity graphs, in Advances in Neural Information Processing Systems (2004), pp. 571–578Google Scholar
  109. 109.
    M. Slaney, K. Weinberger, W. White, Learning a metric for music similarity, in International Symposium on Music Information Retrieval (ISMIR) (2008)Google Scholar
  110. 110.
    B. McFee, G.R. Lanckriet, Heterogeneous embedding for subjective artist similarity, in ISMIR (2009), pp. 513–518Google Scholar
  111. 111.
    B. McFee, L. Barrington, G.R. Lanckriet, Learning similarity from collaborative filters, in ISMIR (2010), pp. 345–350Google Scholar
  112. 112.
    B. McFee, G.R. Lanckriet, Large-scale music similarity search with spatial trees, in ISMIR (2011), pp. 55–60Google Scholar
  113. 113.
    R. Stenzel, T. Kamps, Improving content-based similarity measures by training a collaborative model, in ISMIR (2005), pp. 264–271Google Scholar
  114. 114.
    L. Hofmann-Engl, Towards a cognitive model of melodic similarity, in ISMIR (2001)Google Scholar
  115. 115.
    A. Flexer, E. Pampalk, G. Widmer, Novelty detection based on spectral similarity of songs, in ISMIR (2005), pp. 260–263Google Scholar
  116. 116.
    M. Müller, M. Clausen, Transposition-invariant self-similarity matrices, in ISMIR (2007), pp. 47–50Google Scholar
  117. 117.
    M.D. Hoffman, D.M. Blei, P.R. Cook, Content-based musical similarity computation using the hierarchical dirichlet process, in ISMIR (2008), pp. 349–354Google Scholar
  118. 118.
    D. Schnitzer, A. Flexer, G. Widmer, M. Gasser, Islands of gaussians: the self organizing map and gaussian music similarity features (2010)Google Scholar
  119. 119.
    J.-C. Wang, H.-S. Lee, H.-M. Wang, S.-K. Jeng, Learning the similarity of audio music in bag-of-frames representation from tagged music data, in ISMIR (2011), pp. 85–90Google Scholar
  120. 120.
    T.E. Ahonen, K. Lemström, S. Linkola, Compression-based similarity measures in symbolic, polyphonic music, in ISMIR(Citeseer, 2011), pp. 91–96Google Scholar
  121. 121.
    M. Cebrián, M. Alfonseca, A. Ortega, The normalized compression distance is resistant to noise. IEEE Trans. Inf. Theory 53(5), 1895–1900 (2007)MathSciNetzbMATHCrossRefGoogle Scholar
  122. 122.
    Z. Fu, G. Lu, K.M. Ting, D. Zhang, A survey of audio-based music classification and annotation. IEEE Trans. Multimed. 13(2), 303–319 (2011)CrossRefGoogle Scholar
  123. 123.
    A. L. Berenzweig, D.P. Ellis, Locating singing voice segments within music signals, in 2001 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics (IEEE, 2001), pp. 119–122Google Scholar
  124. 124.
    G. Tomlinson, Musicology, anthropology, history, in The Cultural Study of Music (Routledge, London, 2012), pp. 81–94Google Scholar
  125. 125.
    B. Bel, B. Vecchione, Computational musicology. Comput. Humanit. 27(1), 1–5 (1993)CrossRefGoogle Scholar
  126. 126.
    J. Paulus, M. Müller, A. Klapuri, State of the art report: audio-based music structure analysis, in ISMIR (2010), pp. 625–636Google Scholar
  127. 127.
    F. Lerdahl, R.S. Jackendoff, A Generative Theory of Tonal Music (MIT press, Cambridge, 1985)Google Scholar
  128. 128.
    E. Batlle, P. Cano, Automatic segmentation for music classification using competitive hidden markov models (2000)Google Scholar
  129. 129.
    S. Harford, Automatic segmentation, learning and retrieval of melodies using a self-organizing neural network (2003)Google Scholar
  130. 130.
    A. Sheh, D.P. Ellis, Chord segmentation and recognition using em-trained hidden markov models (2003)Google Scholar
  131. 131.
    R. Parry, I. Essa, Feature weighting for segmentation, in Proceedings of ICMIR (2004), pp. 116–119Google Scholar
  132. 132.
    W. Liang, S. Zhang, B. Xu, A hierarchical approach for audio stream segmentation and classification, in ISMIR (2005), pp. 582–585Google Scholar
  133. 133.
    M. Müller, P. Grosche, F. Wiering, Robust segmentation and annotation of folk song recordings, in ISMIR (2009), pp. 735–740Google Scholar
  134. 134.
    T. Prätzlich, M. Müller. Freischütz digital: a case study for reference-based audio segmentation for operas, in ISMIR (2013), pp. 589–594Google Scholar
  135. 135.
    T. Prätzlich, M. Müller, Frame-level audio segmentation for abridged musical works, in ISMIR (2014), pp. 307–312Google Scholar
  136. 136.
    M. Marolt, Probabilistic segmentation and labeling of ethnomusicological field recordings, in ISMIR (2009)Google Scholar
  137. 137.
    M.E. Rodríguez López, A. Volk, D. Bountouridis, Multi-strategy segmentation of melodies, in Proceedings of the 15th Conference of the International Society for Music Information Retrieval (ISMIR 2014) (ISMIR Press, 2014), pp. 207–212Google Scholar
  138. 138.
    H. Lukashevich, I. Fraunhofer, Towards quantitative measures of evaluating song segmentation (2008), pp. 375–380Google Scholar
  139. 139.
    J.-F. Paiement, D. Eck, S. Bengio, A probabilistic model for chord progressions, in Proceedings of the Sixth International Conference on Music Information Retrieval (ISMIR), vol. EPFL-CONF-83178 (2005)Google Scholar
  140. 140.
    J.A. Burgoyne, L.K. Saul, Learning harmonic relationships in digital audio with Dirichlet-based hidden Markov models, in ISMIR (2005), pp. 438–443Google Scholar
  141. 141.
    M. Mauch, K. Noland, S. Dixon, Using musical structure to enhance automatic chord transcription, in ISMIR (2009), pp. 231–236Google Scholar
  142. 142.
    M. Ogihara, T. Li, N-gram chord profiles for composer style representation, in ISMIR (2008), pp. 671–676Google Scholar
  143. 143.
    K. Yoshii, M. Goto, Infinite latent harmonic allocation: a nonparametric bayesian approach to multipitch analysis, in ISMIR (2010), pp. 309–314Google Scholar
  144. 144.
    R. Chen, W. Shen, A. Srinivasamurthy, P. Chordia, Chord recognition using duration-explicit hidden markov models, in ISMIR (Citeseer, 2012), pp. 445–450Google Scholar
  145. 145.
    N. Boulanger-Lewandowski, Y. Bengio, P. Vincent, Audio chord recognition with recurrent neural networks, in ISMIR (Citeseer, 2013), pp. 335–340Google Scholar
  146. 146.
    E.J. Humphrey, J.P. Bello, Rethinking automatic chord recognition with convolutional neural networks, in 2012 11th International Conference on Machine Learning and Applications (ICMLA) (IEEE, 2012), vol. 2, pp. 357–362Google Scholar
  147. 147.
    X. Zhou, A. Lerch, Chord detection using deep learning, in Proceedings of the 16th ISMIR Conference, vol. 53 (2015)Google Scholar
  148. 148.
    P.O. Hoyer, Non-negative sparse coding, in Proceedings of the 2002 12th IEEE Workshop on Neural Networks for Signal Processing(IEEE, 2002), pp. 557–565Google Scholar
  149. 149.
    S.A. Abdallah, M.D. Plumbley, Polyphonic music transcription by non-negative sparse coding of power spectra, in 5th International Conference on Music Information Retrieval (ISMIR) (2004), pp. 318–325Google Scholar
  150. 150.
    T.B. Yakar, P. Sprechmann, R. Litman, A.M. Bronstein, G. Sapiro, Bilevel sparse models for polyphonic music transcription, in ISMIR (2013), pp. 65–70Google Scholar
  151. 151.
    S.T. Madsen, G. Widmer, Towards a computational model of melody identification in polyphonic music, in IJCAI (2007), pp. 459–464Google Scholar
  152. 152.
    G.E. Poliner, D.P. Ellis, A discriminative model for polyphonic piano transcription. EURASIP J. Adv. Signal Process. 2007(1), 048317 (2006)zbMATHCrossRefGoogle Scholar
  153. 153.
    Z. Duan, D. Temperley, Note-level music transcription by maximum likelihood sampling, in ISMIR (Citeseer, 2014), pp. 181–186Google Scholar
  154. 154.
    S. Jo, C.D. Yoo, Melody extraction from polyphonic audio based on particle filter, in ISMIR(Citeseer, 2010), pp. 357–362Google Scholar
  155. 155.
    E. Kapanci, A. Pfeffer, Signal-to-score music transcription using graphical models, in IJCAI (Citeseer, 2005), pp. 758–765Google Scholar
  156. 156.
    S. Raczynski, E. Vincent, F. Bimbot, S. Sagayama, Multiple pitch transcription using dbn-based musicological models, in 2010 Int. Society for Music Information Retrieval Conference (ISMIR) (2010), pp. 363–368Google Scholar
  157. 157.
    G. Grindlay, D.P. Ellis, A probabilistic subspace model for multi-instrument polyphonic transcription, in ISMIR (2010), pp. 21–26Google Scholar
  158. 158.
    N. Boulanger-Lewandowski, Y. Bengio, P. Vincent, Modeling temporal dependencies in high-dimensional sequences: application to polyphonic music generation and transcription (2012), arXiv:1206.6392
  159. 159.
    J. Nam, J. Ngiam, H. Lee, M. Slaney, A classification-based polyphonic piano transcription approach using learned feature representations, in ISMIR (2011), pp. 175–180Google Scholar
  160. 160.
    S. Böck, M. Schedl, Polyphonic piano note transcription with recurrent neural networks, in 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2012), pp. 121–124Google Scholar
  161. 161.
    T. Berg-Kirkpatrick, J. Andreas, D. Klein, Unsupervised transcription of piano music, in Advances in Neural Information Processing Systems (2014), pp. 1538–1546Google Scholar
  162. 162.
    S. Sigtia, E. Benetos, S. Cherla, T. Weyde, A. Garcez, S. Dixon, in An rnn-based music language model for improving automatic music transcription (International Society for Music Information Retrieval, 2014), pp. 53–58,
  163. 163.
    D.P. Ellis, Identifying ‘cover songs’ with beat-synchronous chroma features, in MIREX 2006 (2006), p. 32Google Scholar
  164. 164.
    E. Gómez, Tonal description of music audio signals. Ph.D. thesis, Universitat Pompeu Fabra, 2006Google Scholar
  165. 165.
    K. Lee, Automatic chord recognition from audio using enhanced pitch class profile, in ICMC (2006)Google Scholar
  166. 166.
    T.F. Smith, M.S. Waterman, Comparison of biosequences. Adv. Appl. Math. 2(4), 482–489 (1981)MathSciNetzbMATHCrossRefGoogle Scholar
  167. 167.
    J. Serra, E. Gómez, P. Herrera, X. Serra, Chroma binary similarity and local alignment applied to cover song identification. IEEE Trans. Audio Speech Lang. Process. 16(6), 1138–1151 (2008)CrossRefGoogle Scholar
  168. 168.
    T. Bertin-Mahieux, D.P. Ellis, Large-scale cover song recognition using the 2d fourier transform magnitude, in ISMIR, pp. 241–246 (2012)Google Scholar
  169. 169.
    E.J. Humphrey, O. Nieto, J.P. Bello, Data driven and discriminative projections for large-scale cover song identification, in ISMIR (2013), pp. 149–154Google Scholar
  170. 170.
    C.J. Tralie, P. Bendich, Cover song identification with timbral shape sequences (2015)Google Scholar
  171. 171.
    W. You, R.B. Dannenberg, Polyphonic music note onset detection using semi-supervised learning, in ISMIR (2007), pp. 279–282Google Scholar
  172. 172.
    E. Benetos, A. Holzapfel, Y. Stylianou, Pitched instrument onset detection based on auditory spectra, in ISMIR (International Society for Music Information Retrieval, 2009), pp. 105–110Google Scholar
  173. 173.
    A. Holzapfel, Y. Stylianou, Beat tracking using group delay based onset detection, in ISMIR-International Conference on Music Information Retrieval (ISMIR, 2008), pp. 653–658Google Scholar
  174. 174.
    J.P. Bello, L. Daudet, S. Abdallah, C. Duxbury, M. Davies, M.B. Sandler, A tutorial on onset detection in music signals. IEEE Trans. Speech Audio Process. 13(5), 1035–1047 (2005)CrossRefGoogle Scholar
  175. 175.
    J. Schluter, S. Bock, Improved musical onset detection with convolutional neural networks, in 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2014), pp. 6979–6983Google Scholar
  176. C. Raphael, Automated rhythm transcription, in ISMIR (2001), pp. 99–107
  177. M.A. Alonso, G. Richard, B. David, Tempo and beat estimation of musical signals, in ISMIR (2004)
  178. J. Paulus, A. Klapuri, Combining temporal and spectral features in HMM-based drum transcription, in ISMIR (2007), pp. 225–228
  179. O. Gillet, G. Richard, Supervised and unsupervised sequence modelling for drum transcription, in ISMIR (2007), pp. 219–224
  180. M. Le Coz, H. Lachambre, L. Koenig, R. André-Obrecht, A segmentation-based tempo induction method, in ISMIR (2010), pp. 27–32
  181. R. André-Obrecht, A new statistical approach for the automatic segmentation of continuous speech signals. IEEE Trans. Acoust. Speech Signal Process. 36(1), 29–40 (1988)
  182. S. Dixon, An on-line time warping algorithm for tracking musical performances, in IJCAI (2005), pp. 1727–1728
  183. D.J. Berndt, J. Clifford, Using dynamic time warping to find patterns in time series, in KDD Workshop, vol. 10 (Seattle, WA, 1994), pp. 359–370
  184. B. Pardo, W. Birmingham, Modeling form for on-line following of musical performances, in Proceedings of the National Conference on Artificial Intelligence (AAAI Press, Menlo Park, 2005), vol. 20, p. 1018
  185. A.E. Coca, L. Zhao, Musical rhythmic pattern extraction using relevance of communities in networks. Inf. Sci. 329, 819–848 (2016)
  186. G. Peeters, Sequence representation of music structure using higher-order similarity matrix and maximum-likelihood approach, in ISMIR (2007), pp. 35–40
  187. M. Müller, S. Ewert, Joint structure analysis with applications to music annotation and synchronization, in ISMIR (2008), pp. 389–394
  188. M. Bergeron, D. Conklin, Structured polyphonic patterns, in ISMIR (2008), pp. 69–74
  189. F. Kaiser, T. Sikora, Music structure discovery in popular music using non-negative matrix factorization, in ISMIR (2010), pp. 429–434
  190. J. Madsen, B. Sand Jensen, J. Larsen, Modeling temporal structure in music for emotion prediction using pairwise comparisons (2014)
  191. Z. Juhász, Motive identification in 22 folksong corpora using dynamic time warping and self organizing map, in ISMIR (2009), pp. 171–176
  192. O. Lartillot, Efficient extraction of closed motivic patterns in multi-dimensional symbolic representations of music, in Proceedings of the 2005 IEEE/WIC/ACM International Conference on Web Intelligence (IEEE, 2005), pp. 229–235
  193. T. Collins, Discovery of repeated themes and sections. Accessed 4 May 2013
  194. S.T. Madsen, G. Widmer, Exploring pianist performance styles with evolutionary string matching. Int. J. Artif. Intell. Tools 15(04), 495–513 (2006)
  195. C.S. Sapp, Comparative analysis of multiple musical performances, in ISMIR (2007), pp. 497–500
  196. M. Molina-Solana, J.L. Arcos, E. Gómez, Using expressive trends for identifying violin performers, in ISMIR (2008), pp. 495–500
  197. K. Okumura, S. Sako, T. Kitamura, Stochastic modeling of a musical performance with expressive representations from the musical score, in ISMIR (Citeseer, 2011), pp. 531–536
  198. S. Van Herwaarden, M. Grachten, W.B. De Haas, Predicting expressive dynamics in piano performances using neural networks, in Proceedings of the 15th Conference of the International Society for Music Information Retrieval (ISMIR 2014) (International Society for Music Information Retrieval, 2014), pp. 45–52
  199. G. Nierhaus, Algorithmic Composition: Paradigms of Automated Music Generation (Springer Science & Business Media, Vienna, 2009)
  200. F. Maillet, D. Eck, G. Desjardins, P. Lamere et al., Steerable playlist generation by learning song similarity from radio station playlists, in ISMIR (2009), pp. 345–350
  201. B. McFee, G.R. Lanckriet, The natural language of playlists, in ISMIR (2011), pp. 537–542
  202. S. Chen, J.L. Moore, D. Turnbull, T. Joachims, Playlist prediction via metric embedding, in Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM, New York, 2012), pp. 714–722
  203. E. Zheleva, J. Guiver, E. Mendes Rodrigues, N. Milić-Frayling, Statistical models of music-listening sessions in social media, in Proceedings of the 19th International Conference on World Wide Web (ACM, New York, 2010), pp. 1019–1028
  204. E. Liebman, M. Saar-Tsechansky, P. Stone, DJ-MC: a reinforcement-learning agent for music playlist recommendation, in Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems (International Foundation for Autonomous Agents and Multiagent Systems, 2015), pp. 591–599
  205. X. Wang, Y. Wang, D. Hsu, Y. Wang, Exploration in interactive personalized music recommendation: a reinforcement learning approach (2013), arXiv:1311.6355
  206. B. Logan, A. Salomon, A music similarity function based on signal analysis, in ICME (2001), pp. 22–25
  207. B. Logan, Content-based playlist generation: exploratory experiments, in ISMIR (2002)
  208. A. Lehtiniemi, Evaluating SuperMusic: streaming context-aware mobile music service, in Proceedings of the 2008 International Conference on Advances in Computer Entertainment Technology (ACM, New York, 2008), pp. 314–321
  209. M. Taramigkou, E. Bothos, K. Christidis, D. Apostolou, G. Mentzas, Escape the bubble: guided exploration of music preferences for serendipity and novelty, in Proceedings of the 7th ACM Conference on Recommender Systems (ACM, New York, 2013), pp. 335–338
  210. R.L. De Mantaras, J.L. Arcos, AI and music: from composition to expressive performance. AI Mag. 23(3), 43 (2002)
  211. R. Ramirez, A. Hazan, A tool for generating and explaining expressive music performances of monophonic jazz melodies. Int. J. Artif. Intell. Tools 15(04), 673–691 (2006)
  212. R. Ramirez, A. Hazan, Inducing a generative expressive performance model using a sequential-covering genetic algorithm, in Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (ACM, New York, 2007), pp. 2159–2166
  213. D. Diakopoulos, O. Vallis, J. Hochenbaum, J.W. Murphy, A. Kapur, 21st century electronica: MIR techniques for classification and performance, in ISMIR (2009), pp. 465–470
  214. K. Murata, K. Nakadai, K. Yoshii, R. Takeda, T. Torii, H.G. Okuno, Y. Hasegawa, H. Tsujino, A robot singer with music recognition based on real-time beat tracking, in ISMIR (2008), pp. 199–204
  215. G. Xia, J. Tay, R. Dannenberg, M. Veloso, Autonomous robot dancing driven by beats and emotions of music, in Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, vol. 1 (International Foundation for Autonomous Agents and Multiagent Systems, 2012), pp. 205–212
  216. D. Eck, J. Schmidhuber, Finding temporal structure in music: blues improvisation with LSTM recurrent networks, in Proceedings of the 2002 12th IEEE Workshop on Neural Networks for Signal Processing (IEEE, 2002), pp. 747–756
  217. B. Thom, Machine learning techniques for real-time improvisational solo trading, in ICMC (2001)
  218. B. Thom, Unsupervised learning and interactive jazz/blues improvisation, in AAAI/IAAI (2000), pp. 652–657
  219. G. Assayag, S. Dubnov, Using factor oracles for machine improvisation. Soft Comput. 8(9), 604–610 (2004)
  220. K. Kosta, M. Marchini, H. Purwins, Unsupervised chord-sequence generation from an audio example, in ISMIR (2012), pp. 481–486
  221. F. Colombo, S.P. Muscinelli, A. Seeholzer, J. Brea, W. Gerstner, Algorithmic composition of melodies with deep recurrent neural networks (2016), arXiv:1606.07251
  222. S. Dieleman, A. van den Oord, K. Simonyan, The challenge of realistic music generation: modelling raw audio at scale, in Advances in Neural Information Processing Systems (2018), pp. 7999–8009
  223. C.A. Huang, A. Vaswani, J. Uszkoreit, N. Shazeer, C. Hawthorne, A.M. Dai, M.D. Hoffman, D. Eck, An improved relative self-attention mechanism for transformer with application to music generation (2018), arXiv:1809.04281
  224. J.D. Fernández, F. Vico, AI methods in algorithmic composition: a comprehensive survey. J. Artif. Intell. Res. 48, 513–582 (2013)
  225. R.B. Dannenberg, Music representation issues, techniques, and systems. Comput. Music. J. 17(3), 20–30 (1993)
  226. D. Rizo, P.J. Ponce de León, C. Pérez-Sancho, A. Pertusa, J. Iñesta, A pattern recognition approach for melody track selection in MIDI files (2006)
  227. A. Ruppin, H. Yeshurun, MIDI music genre classification by invariant features (2006)
  228. L.-C. Yang, S.-Y. Chou, Y.-H. Yang, MidiNet: a convolutional generative adversarial network for symbolic-domain music generation using 1D and 2D conditions (2017), arXiv:1703.10847
  229. P. Grosche, M. Müller, C.S. Sapp, What makes beat tracking difficult? A case study on Chopin mazurkas, in ISMIR (2010), pp. 649–654
  230. A. Rauber, E. Pampalk, D. Merkl, Using psycho-acoustic models and self-organizing maps to create a hierarchical structuring of music by sound similarity (2002)
  231. R. Hillewaere, B. Manderick, D. Conklin, String quartet classification with monophonic models, in ISMIR (2010), pp. 537–542
  232. W.-H. Tsai, H.-M. Yu, H.-M. Wang et al., Query-by-example technique for retrieving cover versions of popular songs with similar melodies, in ISMIR (2005), pp. 183–190
  233. H.-W. Nienhuys, J. Nieuwenhuizen, LilyPond, a system for automated music engraving, in Proceedings of the XIV Colloquium on Musical Informatics (XIV CIM 2003), vol. 1 (2003), pp. 167–171
  234. D. Huron, Music information processing using the Humdrum toolkit: concepts, examples, and lessons. Comput. Music. J. 26(2), 11–26 (2002)
  235. C.S. Sapp, Online database of scores in the Humdrum file format, in ISMIR (2005), pp. 664–665
  236. M. Good, MusicXML for notation and analysis. Virtual Score: Represent. Retr. Restor. 12, 113–124 (2001)
  237. S. Sinclair, M. Droettboom, I. Fujinaga, LilyPond for pyScore: approaching a universal translator for music notation, in ISMIR (2006), pp. 387–388
  238. M.S. Cuthbert, C. Ariza, L. Friedland, Feature extraction and machine learning on symbolic music using the music21 toolkit, in ISMIR (2011), pp. 387–392
  239. C. Antila, J. Cumming, The VIS framework: analyzing counterpoint in large datasets, in ISMIR (2014), pp. 71–76
  240. D. Pye, Content-based methods for the management of digital music, in Proceedings of the 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP'00, vol. 4 (IEEE, 2000), pp. 2437–2440
  241. D.G. Lowe, Object recognition from local scale-invariant features, in The Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2 (IEEE, 1999), pp. 1150–1157
  242. N. Dalal, B. Triggs, Histograms of oriented gradients for human detection, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2005, vol. 1 (IEEE, 2005), pp. 886–893
  243. B. Logan et al., Mel frequency cepstral coefficients for music modeling, in ISMIR (2000)
  244. M.R. Hasan, M. Jamil, M. Rahman et al., Speaker identification using mel frequency cepstral coefficients. Variations 1(4) (2004)
  245. P. Proutskova, M.A. Casey, You call that singing? Ensemble classification for multi-cultural collections of music recordings, in ISMIR (Citeseer, 2009), pp. 759–764
  246. B.W. Schuller, C. Kozielski, F. Weninger, F. Eyben, G. Rigoll et al., Vocalist gender recognition in recorded popular music, in ISMIR (2010), pp. 613–618
  247. B. Tomasik, J.H. Kim, M. Ladlow, M. Augat, D. Tingle, R. Wicentowski, D. Turnbull, Using regression to combine data sources for semantic music discovery, in ISMIR (2009), pp. 405–410
  248. Y. Han, K. Lee, Hierarchical approach to detect common mistakes of beginner flute players, in ISMIR (2014), pp. 77–82
  249. M. Marolt, Probabilistic segmentation and labeling of ethnomusicological field recordings, in ISMIR (2009), pp. 75–80
  250. D.P. Ellis, G.E. Poliner, Identifying 'cover songs' with chroma features and dynamic programming beat tracking, in IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2007, vol. 4 (IEEE, 2007), pp. IV–1429
  251. Y. LeCun, Y. Bengio, G. Hinton, Deep learning. Nature 521(7553), 436–444 (2015)
  252. H. Lee, P. Pham, Y. Largman, A.Y. Ng, Unsupervised feature learning for audio classification using convolutional deep belief networks, in Advances in Neural Information Processing Systems (2009), pp. 1096–1104
  253. P. Hamel, D. Eck, Learning features from music audio with deep belief networks, in ISMIR, vol. 10 (Utrecht, The Netherlands, 2010), pp. 339–344
  254. M. Henaff, K. Jarrett, K. Kavukcuoglu, Y. LeCun, Unsupervised learning of sparse features for scalable audio classification, in ISMIR (2011)
  255. E.J. Humphrey, J.P. Bello, Y. LeCun, Moving beyond feature design: deep architectures and automatic feature learning in music informatics, in ISMIR (Citeseer, 2012), pp. 403–408
  256. C. Xu, N.C. Maddage, X. Shao, F. Cao, Q. Tian, Musical genre classification using support vector machines, in Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'03), vol. 5 (IEEE, 2003), pp. V–429
  257. M.I. Mandel, D.P. Ellis, Multiple-instance learning for music information retrieval, in ISMIR (2008), pp. 577–582
  258. J. Shawe-Taylor, A. Meng, An investigation of feature models for music genre classification using the support vector classifier (2005)
  259. M. Helen, T. Virtanen, Separation of drums from polyphonic music using non-negative matrix factorization and support vector machine, in 2005 13th European Signal Processing Conference (IEEE, 2005), pp. 1–4
  260. S.R. Ness, A. Theocharis, G. Tzanetakis, L.G. Martins, Improving automatic music tag annotation using stacked generalization of probabilistic SVM outputs, in Proceedings of the 17th ACM International Conference on Multimedia (ACM, New York, 2009), pp. 705–708
  261. N.C. Maddage, C. Xu, Y. Wang, An SVM-based classification approach to musical audio (2003)
  262. M. Gruhne, K. Schmidt, C. Dittmar, Detecting phonemes within the singing of polyphonic music, in Proceedings of ICoMCS (2007), p. 60
  263. A.S. Durey, M.A. Clements, Melody spotting using hidden Markov models, in ISMIR (2001)
  264. K.C. Noland, M.B. Sandler, Key estimation using a hidden Markov model, in ISMIR (2006), pp. 121–126
  265. H. Papadopoulos, G. Tzanetakis, Modeling chord and key structure with Markov logic, in ISMIR (2012), pp. 127–132
  266. D. Morris, I. Simon, S. Basu, Exposing parameters of a trained dynamic model for interactive music creation, in AAAI (2008), pp. 784–791
  267. E. Nakamura, P. Cuvillier, A. Cont, N. Ono, S. Sagayama, Autoregressive hidden semi-Markov model of symbolic music performance for score following, in 16th International Society for Music Information Retrieval Conference (ISMIR) (2015)
  268. E. Nakamura, N. Ono, S. Sagayama, Merged-output HMM for piano fingering of both hands, in ISMIR (2014), pp. 531–536
  269. P. Jancovic, M. Köküer, W. Baptiste, Automatic transcription of ornamented Irish traditional flute music using hidden Markov models, in ISMIR (2015), pp. 756–762
  270. C. Raphael, A graphical model for recognizing sung melodies, in ISMIR (2005), pp. 658–663
  271. C. Raphael, A hybrid graphical model for aligning polyphonic audio with musical scores, in ISMIR (2004), pp. 387–394
  272. J. Pickens, C.S. Iliopoulos, Markov random fields and maximum entropy modeling for music information retrieval, in ISMIR (2005), pp. 207–214
  273. D. Hu, L.K. Saul, A probabilistic topic model for unsupervised learning of musical key-profiles, in ISMIR (Citeseer, 2009), pp. 441–446
  274. E.M. Schmidt, Y.E. Kim, Modeling musical emotion dynamics with conditional random fields, in ISMIR (2011), pp. 777–782
  275. E. Schmidt, Y. Kim, Learning rhythm and melody features with deep belief networks, in ISMIR (2013), pp. 21–26
  276. R. Manzelli, V. Thakkar, A. Siahkamari, B. Kulis, Conditioning deep generative raw audio models for structured automatic music (2018), arXiv:1806.09905
  277. R.B. Dannenberg, B. Thom, D. Watson, A machine learning approach to musical style recognition (1997)
  278. F.J. Kiernan, Score-based style recognition using artificial neural networks, in ISMIR (2000)
  279. N. Griffith, P.M. Todd et al., Musical Networks: Parallel Distributed Perception and Performance (MIT Press, Cambridge, 1999)
  280. S. Böck, F. Krebs, G. Widmer, Joint beat and downbeat tracking with recurrent neural networks, in ISMIR (2016), pp. 255–261
  281. F. Krebs, S. Böck, M. Dorfer, G. Widmer, Downbeat tracking using beat synchronous features with recurrent neural networks, in ISMIR (2016), pp. 129–135
  282. K. Choi, G. Fazekas, M. Sandler, Automatic tagging using deep convolutional neural networks (2016), arXiv:1606.00298
  283. R. Vogl, M. Dorfer, P. Knees, Recurrent neural networks for drum transcription, in ISMIR (2016), pp. 730–736
  284. I.-T. Liu, R. Randall, Predicting missing music components with bidirectional long short-term memory neural networks, in ISMIR (2016), pp. 225–231
  285. D. Liang, M. Zhan, D.P. Ellis, Content-aware collaborative music recommendation using pre-trained neural networks, in ISMIR (2015), pp. 295–301
  286. A. Van den Oord, S. Dieleman, B. Schrauwen, Deep content-based music recommendation, in Advances in Neural Information Processing Systems (2013), pp. 2643–2651
  287. S. Durand, S. Essid, Downbeat detection with conditional random fields and deep learned features, in ISMIR (2016), pp. 386–392
  288. H.-W. Dong, W.-Y. Hsiao, L.-C. Yang, Y.-H. Yang, MuseGAN: multi-track sequential generative adversarial networks for symbolic music generation and accompaniment, in Proceedings of the AAAI Conference on Artificial Intelligence (2018)
  289. Y. Panagakis, C. Kotropoulos, G.R. Arce, Sparse multi-label linear embedding nonnegative tensor factorization for automatic music tagging, in Eighteenth European Signal Processing Conference (2010), pp. 492–496
  290. T. Masuda, K. Yoshii, M. Goto, S. Morishima, Spotting a query phrase from polyphonic music audio signals based on semi-supervised nonnegative matrix factorization, in ISMIR (2014), pp. 227–232
  291. D. Liang, M.D. Hoffman, D.P. Ellis, Beta process sparse nonnegative matrix factorization for music, in ISMIR (2013), pp. 375–380
  292. D. Liang, J. Paisley, D. Ellis et al., Codebook-based scalable music tagging with Poisson matrix factorization, in ISMIR (Citeseer, 2014), pp. 167–172
  293. R. Basili, A. Serafini, A. Stellato, Classification of musical genre: a machine learning approach, in ISMIR (2004)
  294. Y. Lavner, D. Ruinskiy, A decision-tree-based algorithm for speech/music classification and segmentation. EURASIP J. Audio Speech Music. Process. 2009, 2 (2009)
  295. P. Herrera-Boyer, G. Peeters, S. Dubnov, Automatic classification of musical instrument sounds. J. New Music. Res. 32(1), 3–21 (2003)
  296. K. West, S. Cox, Finding an optimal segmentation for audio genre classification, in ISMIR (2005), pp. 680–685
  297. S. Dupont, T. Ravet, Improved audio classification using a novel non-linear dimensionality reduction ensemble approach, in ISMIR (Citeseer, 2013), pp. 287–292
  298. N. Casagrande, D. Eck, B. Kégl, Frame-level audio feature extraction using AdaBoost, in ISMIR (2005), pp. 345–350
  299. D. Turnbull, G.R. Lanckriet, E. Pampalk, M. Goto, A supervised approach for detecting boundaries in music using difference features and boosting, in ISMIR (2007), pp. 51–54
  300. C.L. Parker, Applications of binary classification and adaptive boosting to the query-by-humming problem, in ISMIR (2005), pp. 245–251
  301. R. Foucard, S. Essid, M. Lagrange, G. Richard et al., Multi-scale temporal fusion by boosting for music classification, in ISMIR (2011), pp. 663–668
  302. A. Anglade, R. Ramirez, S. Dixon et al., Genre classification using harmony rules induced from automatic chord transcriptions, in ISMIR (2009), pp. 669–674
  303. N. Tokui, H. Iba et al., Music composition with interactive evolutionary computation, in Proceedings of the 3rd International Conference on Generative Art, vol. 17 (2000), pp. 215–226
  304. J.A. Biles, Improvizing with genetic algorithms: GenJam, in Evolutionary Computer Music (Springer, Berlin, 2007), pp. 137–169
  305. M. Rohrmeier, A generative grammar approach to diatonic harmonic structure, in Proceedings of the 4th Sound and Music Computing Conference (2007), pp. 97–100
  306. W.B. De Haas, M. Rohrmeier, R.C. Veltkamp, F. Wiering, Modeling harmonic similarity using a generative grammar of tonal harmony, in Proceedings of the Tenth International Conference on Music Information Retrieval (ISMIR) (2009)
  307. J. McCormack, Grammar based music composition. Complex Syst. 96, 321–336 (1996)
  308. F. Pachet, P. Roy, Musical harmonization with constraints: a survey. Constraints 6(1), 7–19 (2001)
  309. S. Franklin, A. Graesser, Is it an agent, or just a program? A taxonomy for autonomous agents, in International Workshop on Agent Theories, Architectures, and Languages (Springer, Berlin, 1996), pp. 21–35
  310. J. Solis, A. Takanishi, K. Hashimoto, Development of an anthropomorphic saxophone-playing robot, in Brain, Body and Machine (Springer, Berlin, 2010), pp. 175–186
  311. K. Petersen, J. Solis, A. Takanishi, Musical-based interaction system for the Waseda flutist robot. Auton. Robot. 28(4), 471–488 (2010)
  312. C. Raphael, Demonstration of Music Plus One: a real-time system for automatic orchestral accompaniment, in AAAI (2006), pp. 1951–1952
  313. M. Bretan, G. Weinberg, A survey of robotic musicianship. Commun. ACM 59(5), 100–109 (2016)
  314. A. Albin, G. Weinberg, M. Egerstedt, Musical abstractions in distributed multi-robot systems, in 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (IEEE, 2012), pp. 451–458
  315. X. Wang, Y. Wang, D. Hsu, Y. Wang, Exploration in interactive personalized music recommendation: a reinforcement learning approach. ACM Trans. Multimed. Comput. Commun. Appl. (TOMM) 11(1), 7 (2014)
  316. M. Dorfer, F. Henkel, G. Widmer, Learning to listen, read, and follow: score following as a reinforcement learning game (2018), arXiv:1807.06391
  317. K. Murphy, Machine Learning: A Probabilistic Perspective (MIT Press, Cambridge, 2012)
  318. L. Reboursière, O. Lähdeoja, T. Drugman, S. Dupont, C. Picard-Limpens, N. Riche, Left and right-hand guitar playing techniques detection, in NIME (2012)
  319. N. Cook, Performance analysis and Chopin's mazurkas. Music. Sci. 11(2), 183–207 (2007)
  320. J.S. Downie, The music information retrieval evaluation exchange (2005–2007): a window into music information retrieval research. Acoust. Sci. Technol. 29(4), 247–255 (2008)
  321. J.H. Lee, Crowdsourcing music similarity judgments using Mechanical Turk, in ISMIR (2010), pp. 183–188
  322. J. Weston, S. Bengio, P. Hamel, Multi-tasking with joint semantic spaces for large-scale music annotation and retrieval. J. New Music. Res. 40(4), 337–348 (2011)
  323. S. Craw, B. Horsburgh, S. Massie, Music recommenders: user evaluation without real users?, in AAAI/International Joint Conferences on Artificial Intelligence (IJCAI) (2015)
  324. M. Ramona, G. Cabral, F. Pachet, Capturing a musician's groove: generation of realistic accompaniments from single song recordings, in IJCAI (2015), pp. 4140–4142
  325. T. Otsuka, K. Nakadai, T. Ogata, H.G. Okuno, Incremental Bayesian audio-to-score alignment with flexible harmonic structure models, in ISMIR (2011), pp. 525–530

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Department of Computer Science, University of Texas, Austin, USA