Labelling the Structural Parts of a Music Piece with Markov Models

  • Jouni Paulus
  • Anssi Klapuri
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5493)


This paper describes a method for labelling the structural parts of a musical piece. Existing methods for analysing piece structure often name the parts with musically meaningless tags, e.g., “p1”, “p2”, “p3”. Given a sequence of these tags as input, the proposed system assigns musically more meaningful labels to them; e.g., given the input “p1, p2, p3, p2, p3”, the system might produce “intro, verse, chorus, verse, chorus”. The label assignment is chosen by scoring the resulting label sequences with Markov models. Both traditional and variable-order Markov models are evaluated for the sequence modelling. The search over the label permutations is done with an N-best variant of the token passing algorithm. The proposed method is evaluated with leave-one-out cross-validation on two large, manually annotated data sets of popular music. The results show that Markov models perform well in the desired task.
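The scoring idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a first-order (bigram) Markov model with add-one smoothing, a tiny invented training corpus, a one-to-one mapping from tags to labels, and an exhaustive search over assignments in place of the N-best token passing search. All function and variable names are hypothetical.

```python
import math
from collections import defaultdict
from itertools import permutations

START, END = "<s>", "</s>"  # sequence boundary symbols (an assumption)


def train_bigram(sequences, labels, alpha=1.0):
    """Return a function giving smoothed log P(cur | prev) for a bigram model."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        padded = [START] + list(seq) + [END]
        for prev, cur in zip(padded, padded[1:]):
            counts[prev][cur] += 1
    n_outcomes = len(labels) + 1  # every label plus the END symbol

    def logp(prev, cur):
        total = sum(counts[prev].values())
        # Additive smoothing keeps unseen transitions at nonzero probability.
        return math.log((counts[prev][cur] + alpha) / (total + alpha * n_outcomes))

    return logp


def score(label_seq, logp):
    """Log-probability of a whole label sequence under the bigram model."""
    padded = [START] + list(label_seq) + [END]
    return sum(logp(prev, cur) for prev, cur in zip(padded, padded[1:]))


def best_labelling(tags, labels, logp):
    """Try every one-to-one tag-to-label assignment and keep the best-scoring one."""
    unique_tags = list(dict.fromkeys(tags))  # distinct tags, in order of appearance
    best_score, best_seq = -math.inf, None
    for perm in permutations(labels, len(unique_tags)):
        mapping = dict(zip(unique_tags, perm))
        candidate = [mapping[t] for t in tags]
        s = score(candidate, logp)
        if s > best_score:
            best_score, best_seq = s, candidate
    return best_seq


# Toy training data, invented for illustration only.
LABELS = ["intro", "verse", "chorus", "bridge", "outro"]
TRAINING = [
    ["intro", "verse", "chorus", "verse", "chorus", "outro"],
    ["intro", "verse", "chorus", "verse", "chorus", "bridge", "chorus", "outro"],
    ["intro", "verse", "verse", "chorus", "verse", "chorus", "outro"],
]

logp = train_bigram(TRAINING, LABELS)
result = best_labelling(["p1", "p2", "p3", "p2", "p3"], LABELS, logp)
```

With this toy corpus, `result` is `["intro", "verse", "chorus", "verse", "chorus"]`, matching the example in the abstract. Exhaustive search is only practical for a handful of distinct tags, since the number of assignments grows factorially with the label set, which is presumably why the paper relies on an N-best token passing search instead.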





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Jouni Paulus (1)
  • Anssi Klapuri (1)
  1. Department of Signal Processing, Tampere University of Technology, Tampere, Finland
