Transcription of Musical Audio Using Poisson Point Processes and Sequential MCMC

  • Pete Bunch
  • Simon Godsill
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6684)


In this paper, models and algorithms are presented for the transcription of pitch and timing in polyphonic music extracts. The data are decomposed frame-wise into the frequency domain, where a Poisson point process model is used to construct a polyphonic pitch likelihood function. Bayesian priors are then incorporated both over time (to link successive frames) and within frames (to model the number of notes present, their pitches, the number of harmonics for each note, and per-note inharmonicity parameters). Inference is carried out via Bayesian filtering using a sequential Markov chain Monte Carlo (MCMC) algorithm, an MCMC extension of particle filtering. Initial results on guitar music, both laboratory test data and commercial extracts, show promising levels of performance.
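The Poisson point process likelihood described above can be sketched as follows. This is a minimal illustrative example, not the paper's actual model: observed spectral peaks in a frame are scored against an intensity function built from Gaussian bumps at the harmonics of each candidate pitch plus a flat noise floor (the bump width, harmonic count, and noise level here are assumed placeholder values).

```python
import numpy as np

def intensity(freqs, pitches, n_harmonics=8, width=2.0, noise=0.05):
    """Poisson intensity over frequency (Hz): Gaussian bumps at the
    harmonics of each candidate pitch, plus a flat noise floor."""
    freqs = np.asarray(freqs, dtype=float)
    lam = np.full_like(freqs, noise)
    for f0 in pitches:
        for h in range(1, n_harmonics + 1):
            lam += np.exp(-0.5 * ((freqs - h * f0) / width) ** 2)
    return lam

def log_likelihood(peaks, pitches, f_max=4000.0, n_grid=4000, **kw):
    """Poisson point-process log-likelihood of a frame's observed peaks:
    sum of log-intensities at the peaks minus the integrated intensity."""
    grid = np.linspace(0.0, f_max, n_grid)
    lam_grid = intensity(grid, pitches, **kw)
    integral = lam_grid.sum() * (grid[1] - grid[0])  # Riemann approximation
    lam_peaks = intensity(peaks, pitches, **kw)
    return np.sum(np.log(lam_peaks)) - integral
```

Under this sketch, peaks lying on the harmonics of a candidate pitch raise its likelihood, so comparing `log_likelihood` across candidate pitches (or pitch sets) performs a crude multi-pitch estimate for one frame; the paper's sequential MCMC then links such frame-wise likelihoods through temporal priors.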


Automated music transcription · multi-pitch estimation · Bayesian filtering · Poisson point process · Markov chain Monte Carlo · particle filter · spatio-temporal dynamical model





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Pete Bunch (1)
  • Simon Godsill (1)
  1. Signal Processing and Communications Laboratory, Department of Engineering, University of Cambridge, UK
