Transcribing Bach Chorales Using Particle Swarm Optimisations

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7928)

Abstract

This paper reports a novel application of particle swarm optimisation (PSO) to the polyphonic transcription task. The system transforms input audio into the activation strengths of pitches in the desired range. The transformation proceeds from time-domain audio to a frequency-domain representation and, finally, to pitch activation strengths (a.k.a. the piano-roll representation). The likely sounding pitches can be inferred by comparing the observed activation strengths of the input audio with reference Tone-models. Although each Tone-model is learned offline from the pitches one wishes to transcribe, the learned model only approximates the characteristics of a sounding note, because volume and other effects vary with the manner in which notes are executed. Hence, predicting sounding notes based solely on Tone-models gives inaccurate predictions. Here, we apply PSO to search for an optimum aggregation of different predicted pitches that best represents the input activation strength. We describe our problem formulation and the design of our approach. The experimental results show that our approach has potential for the task of polyphonic transcription.
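The abstract does not give the paper's actual formulation, so the following is only a minimal sketch of the general idea, under stated assumptions: each particle is a vector of per-pitch weights in [0, 1], the fitness is the squared error between the observed activation and the weighted sum of Tone-models, and a pitch is reported as sounding when its weight exceeds a threshold. The tone-model values, the pitch names, the PSO coefficients, and the 0.3 threshold are all hypothetical choices made for illustration.

```python
import random

# Hypothetical tone models: activation strength per frequency bin for each pitch.
TONE_MODELS = {
    "C4": [1.0, 0.5, 0.0, 0.0, 0.2, 0.0],
    "E4": [0.0, 1.0, 0.4, 0.0, 0.0, 0.1],
    "G4": [0.0, 0.0, 1.0, 0.5, 0.0, 0.2],
    "C5": [0.2, 0.0, 0.0, 1.0, 0.5, 0.0],
}


def mix(weights, models):
    """Weighted sum of tone models: the activation these weights would predict."""
    n_bins = len(next(iter(models.values())))
    out = [0.0] * n_bins
    for w, template in zip(weights, models.values()):
        for b, value in enumerate(template):
            out[b] += w * value
    return out


def error(weights, observed, models):
    """Squared error between observed and predicted activation (assumed fitness)."""
    predicted = mix(weights, models)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


def pso_fit(observed, models, n_particles=30, n_iters=200, seed=0):
    """Global-best PSO over per-pitch weights, each clamped to [0, 1]."""
    rng = random.Random(seed)
    dim = len(models)
    inertia, c_personal, c_global = 0.72, 1.49, 1.49  # common textbook settings
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_err = [error(p, observed, models) for p in pos]
    g = min(range(n_particles), key=best_err.__getitem__)
    g_pos, g_err = best_pos[g][:], best_err[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c_global * rng.random() * (g_pos[d] - pos[i][d]))
                pos[i][d] = min(1.0, max(0.0, pos[i][d] + vel[i][d]))
            e = error(pos[i], observed, models)
            if e < best_err[i]:  # update personal best
                best_pos[i], best_err[i] = pos[i][:], e
                if e < g_err:  # update swarm-wide best
                    g_pos, g_err = pos[i][:], e
    return g_pos, g_err


# Synthetic frame: C4 and G4 sounding at different (made-up) volumes.
observed = mix([0.8, 0.0, 0.6, 0.0], TONE_MODELS)
weights, err = pso_fit(observed, TONE_MODELS)
detected = [pitch for pitch, w in zip(TONE_MODELS, weights) if w > 0.3]
print(detected, round(err, 6))
```

Searching over continuous weights rather than a binary on/off mask is one way to absorb the volume variation the abstract mentions. A real system would run a search like this per analysis frame on activations derived from audio, which this toy example does not attempt.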

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Phon-Amnuaisuk, S. (2013). Transcribing Bach Chorales Using Particle Swarm Optimisations. In: Tan, Y., Shi, Y., Mo, H. (eds) Advances in Swarm Intelligence. ICSI 2013. Lecture Notes in Computer Science, vol 7928. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38703-6_23

  • DOI: https://doi.org/10.1007/978-3-642-38703-6_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-38702-9

  • Online ISBN: 978-3-642-38703-6

  • eBook Packages: Computer Science, Computer Science (R0)
