Transcribing Bach Chorales Using Particle Swarm Optimisations

  • Somnuk Phon-Amnuaisuk
Part of the Lecture Notes in Computer Science book series (LNCS, volume 7928)


This paper reports a novel application of particle swarm optimisation (PSO) to the polyphonic transcription task. The system transforms input audio into activation strengths of pitches in the desired range. The transformation takes the audio from the time domain to the frequency domain and finally to pitch activation strengths (a.k.a. a piano-roll representation). We can infer the likely sounding pitches by comparing the observed activation strength of the input audio to reference Tone-models. Although each Tone-model is learned offline from the pitches one wishes to transcribe, this process often only approximates the Tone-model characteristics, owing to variations in volume and other effects introduced by the manner of note execution. Hence, predicting sounding notes based solely on Tone-models gives inaccurate predictions. Here, we apply PSO to search for an optimum aggregation of different predicted pitches that best represents the input activation strength. We describe our problem formulation and the design of our approach. The experimental results show that our approach has potential for the task of polyphonic transcription.
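The search described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Tone-model values, the six-bin activation vector, the squared-error fitness, the swarm parameters, and the 0.3 detection threshold are all assumptions chosen for illustration. The sketch treats each particle as a vector of weights over candidate pitches and looks for the weighted sum of Tone-models that best matches an observed activation vector.

```python
import random

# Hypothetical toy Tone-models: MIDI pitch -> activation strength over 6 bins.
# Real Tone-models would be learned offline from recordings; these are made up.
TONE_MODELS = {
    60: [1.0, 0.0, 0.0, 0.0, 0.5, 0.0],
    64: [0.0, 1.0, 0.0, 0.0, 0.0, 0.4],
    67: [0.0, 0.0, 1.0, 0.0, 0.3, 0.0],
    72: [0.2, 0.0, 0.0, 1.0, 0.0, 0.0],
}
PITCHES = sorted(TONE_MODELS)


def mix(weights):
    """Activation produced by a weighted sum of the Tone-models."""
    n_bins = len(TONE_MODELS[PITCHES[0]])
    return [sum(w * TONE_MODELS[p][b] for w, p in zip(weights, PITCHES))
            for b in range(n_bins)]


def error(weights, observed):
    """Assumed fitness: squared error between mixture and observation."""
    return sum((m - o) ** 2 for m, o in zip(mix(weights), observed))


def pso(observed, n_particles=30, n_iters=200,
        inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO over weight vectors clamped to [0, 1]."""
    rng = random.Random(seed)
    dim = len(PITCHES)
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                      # personal bests
    best_err = [error(p, observed) for p in pos]
    g = min(range(n_particles), key=best_err.__getitem__)
    g_pos, g_err = best_pos[g][:], best_err[g]          # global best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            e = error(pos[i], observed)
            if e < best_err[i]:
                best_pos[i], best_err[i] = pos[i][:], e
                if e < g_err:
                    g_pos, g_err = pos[i][:], e
    return g_pos, g_err


# Synthetic observation: C4 and G4 sounding at different volumes.
observed = mix([0.8, 0.0, 0.6, 0.0])
weights, err = pso(observed)
detected = [p for p, w in zip(PITCHES, weights) if w > 0.3]
print(detected, round(err, 6))
```

Letting the weights vary continuously is one way to absorb the volume variations the abstract mentions; a binary on/off encoding per pitch would be an equally plausible reading of "aggregation of predicted pitches".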


Particle swarm optimisation; Polyphonic transcription; Tone-models; Transcribing Bach’s Chorales





Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Somnuk Phon-Amnuaisuk (1)
  1. Music Informatics Research Group, Faculty of Business and Computing, Brunei Institute of Technology, Brunei Darussalam
