Abstract
Modelling the real-world complexity of music is a challenge for machine learning. We address the task of modelling melodic sequences from the same music genre. We perform a comparative analysis of two probabilistic models: a Dirichlet Variable Length Markov Model (Dirichlet-VMM) and a Time Convolutional Restricted Boltzmann Machine (TC-RBM). We show that the TC-RBM learns descriptive music features, such as underlying chords and typical melody transitions and dynamics. We assess the models for future prediction and compare their performance to a VMM, which is the current state of the art in melody generation. We show that both models perform significantly better than the VMM, with the Dirichlet-VMM marginally outperforming the TC-RBM. Finally, we evaluate the short-order statistics of the models, using the Kullback-Leibler divergence between test sequences and model samples, and show that our proposed methods match the statistics of the music genre significantly better than the VMM.
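The short-order statistics comparison can be illustrated with a minimal sketch. The abstract does not specify the n-gram order, the direction of the divergence, or any smoothing, so the choices below (bigrams, KL from test data to model samples, additive smoothing with a hypothetical `alpha` parameter) are assumptions made only for illustration, and the toy pitch sequences are invented rather than taken from the paper.

```python
from collections import Counter
from math import log


def ngram_counts(sequences, n):
    """Count every length-n window across a collection of symbol sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - n + 1):
            counts[tuple(seq[i:i + n])] += 1
    return counts


def kl_divergence(p_counts, q_counts, alpha=1.0):
    """KL(P || Q) in nats between two smoothed n-gram distributions.

    alpha is an additive smoothing constant (an assumption of this sketch),
    used so that n-grams unseen in one set do not produce an infinite value.
    """
    support = set(p_counts) | set(q_counts)
    p_total = sum(p_counts.values()) + alpha * len(support)
    q_total = sum(q_counts.values()) + alpha * len(support)
    kl = 0.0
    for gram in support:
        p = (p_counts[gram] + alpha) / p_total
        q = (q_counts[gram] + alpha) / q_total
        kl += p * log(p / q)
    return kl


# Toy pitch sequences (MIDI note numbers), invented for illustration only.
test_sequences = [[60, 62, 64, 62, 60], [60, 62, 64, 65, 64]]
model_samples = [[60, 62, 64, 62, 60], [60, 64, 62, 60, 62]]

# Divergence of bigram statistics: test data versus model samples.
kl_bigram = kl_divergence(ngram_counts(test_sequences, 2),
                          ngram_counts(model_samples, 2))
# Sanity check: a set of sequences compared with itself gives zero.
kl_self = kl_divergence(ngram_counts(test_sequences, 2),
                        ngram_counts(test_sequences, 2))
```

A lower divergence indicates that the samples reproduce the local note-transition statistics of the test melodies more closely. In practice the same computation would be repeated for each n-gram order of interest and for each model's samples.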
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Spiliopoulou, A., Storkey, A. (2011). Comparing Probabilistic Models for Melodic Sequences. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science(), vol 6913. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23808-6_19
DOI: https://doi.org/10.1007/978-3-642-23808-6_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23807-9
Online ISBN: 978-3-642-23808-6
eBook Packages: Computer Science, Computer Science (R0)