Unsupervised Analysis and Generation of Audio Percussion Sequences

  • Marco Marchini
  • Hendrik Purwins
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6684)


A system is presented that learns the structure of an audio recording of a rhythmical percussion fragment in an unsupervised manner and that synthesizes musical variations from it. The procedure consists of 1) segmentation, 2) symbolization (feature extraction, clustering, sequence structure analysis, temporal alignment), and 3) synthesis. The symbolization step yields a sequence of event classes. Simultaneously, representations are maintained that cluster the events into few or many classes. Based on the most regular clustering level, a tempo estimation procedure is used to preserve the metrical structure in the generated sequence. Employing variable length Markov chains, the final synthesis is performed, recombining the audio material derived from the sample itself. Representations with different numbers of classes are used to trade off statistical significance (short context sequence, low clustering refinement) versus specificity (long context, high clustering refinement) of the generated sequence. For a broad variety of musical styles the musical characteristics of the original are preserved. At the same time, considerable variability is introduced in the generated sequence.
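The recombination step described above can be illustrated with a minimal sketch of variable-length Markov generation over a sequence of event-class symbols. This is not the authors' implementation: the function names (train_vlmc, generate), the parameters (max_order, min_count) and the simple count-based pruning rule are assumptions made for illustration. It works on a single clustering level only and leaves out segmentation, clustering, tempo alignment and audio concatenation.

```python
import random
from collections import Counter, defaultdict


def train_vlmc(symbols, max_order=3, min_count=2):
    """Count continuations for every context of length 0..max_order.

    Contexts observed fewer than min_count times are dropped
    (hypothetical pruning rule); the empty context is always kept.
    """
    counts = defaultdict(Counter)
    for i, nxt in enumerate(symbols):
        for order in range(max_order + 1):
            if i - order < 0:
                break
            context = tuple(symbols[i - order:i])
            counts[context][nxt] += 1
    return {ctx: c for ctx, c in counts.items()
            if ctx == () or sum(c.values()) >= min_count}


def generate(model, start, length, max_order=3, rng=None):
    """Extend `start` to `length` symbols, preferring the longest known context."""
    rng = rng or random.Random(0)
    out = list(start)
    while len(out) < length:
        # Back off from the longest context to the empty one.
        for order in range(min(max_order, len(out)), -1, -1):
            context = tuple(out[len(out) - order:])
            if context in model:
                choices = model[context]
                break
        symbols, weights = zip(*choices.items())
        out.append(rng.choices(symbols, weights=weights)[0])
    return out


# Toy symbol sequence standing in for clustered percussion events.
events = list("abcabcabcabc")
model = train_vlmc(events)
variation = generate(model, ["a"], 7)
```

In this sketch a longer context makes the continuation more specific to the original, while backing off to a shorter context relies on counts gathered from more occurrences, which mirrors the trade-off between specificity and statistical significance mentioned in the abstract.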


music analysis, music generation, unsupervised clustering, Markov chains, machine listening





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Marco Marchini (1)
  • Hendrik Purwins (1)
  1. Music Technology Group, Department of Information and Communications Technologies, Universitat Pompeu Fabra, Barcelona, Spain
