Abstract
This paper reports a novel application of particle swarm optimisation (PSO) to the polyphonic transcription task. The system transforms input audio into the activation strengths of pitches in the desired range. The transformation takes the audio from the time domain to the frequency domain and, finally, to pitch activation strengths (a.k.a. the piano-roll representation). The likely sounding pitches can be inferred by comparing the observed activation strengths of the input audio with reference Tone-models. Although each Tone-model is learned offline from the pitches one wishes to transcribe, the learned model only approximates the true tone characteristics because of variations in volume and other effects introduced by the manner of note execution. Hence, predicting sounding notes solely from Tone-models gives inaccurate results. Here, we apply PSO to search for an optimum aggregation of predicted pitches that best represents the input activation strengths. We describe our problem formulation and the design of our approach. The experimental results indicate that the approach has potential for polyphonic transcription.
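The search described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the fitness function, particle encoding, or PSO parameters, so the squared-error fitness, the weight range [0, 1], the toy Tone-model matrix, and all names (`pso_transcribe`, `tone_models`, `observed`) are assumptions made for illustration. Each particle is a candidate vector of pitch weights, and the swarm looks for the weighted sum of Tone-models closest to the observed activation.

```python
import numpy as np

def pso_transcribe(tone_models, observed, n_particles=30, n_iters=200,
                   inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Search for pitch weights in [0, 1] whose weighted sum of tone-models
    best matches the observed activation (assumed squared-error fitness)."""
    rng = np.random.default_rng(seed)
    n_pitches = tone_models.shape[1]

    def error(weights):
        # Squared distance between observation and aggregated tone-models.
        return float(np.sum((observed - tone_models @ weights) ** 2))

    # Each particle is one candidate weight vector.
    pos = rng.uniform(0.0, 1.0, size=(n_particles, n_pitches))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_err = np.array([error(p) for p in pos])
    g = int(np.argmin(pbest_err))
    gbest, gbest_err = pbest[g].copy(), float(pbest_err[g])
    history = [gbest_err]  # best error found so far, per iteration

    for _ in range(n_iters):
        r1 = rng.uniform(size=pos.shape)
        r2 = rng.uniform(size=pos.shape)
        # Standard velocity update: inertia + personal pull + global pull.
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)  # keep weights in the assumed range
        err = np.array([error(p) for p in pos])
        improved = err < pbest_err
        pbest[improved] = pos[improved]
        pbest_err[improved] = err[improved]
        g = int(np.argmin(pbest_err))
        if pbest_err[g] < gbest_err:
            gbest, gbest_err = pbest[g].copy(), float(pbest_err[g])
        history.append(gbest_err)
    return gbest, history

# Toy tone-models (hypothetical): 3 pitches over 6 frequency bins,
# one column per pitch.
tone_models = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 1.0],
    [0.0, 0.5, 0.0],
    [0.3, 0.0, 0.5],
    [0.0, 0.3, 0.0],
])
true_weights = np.array([0.8, 0.0, 0.6])   # pitches 0 and 2 are sounding
observed = tone_models @ true_weights      # synthetic "input activation"

weights, history = pso_transcribe(tone_models, observed)
sounding = np.flatnonzero(weights > 0.3)   # threshold is an arbitrary choice
print("estimated weights:", np.round(weights, 2))
print("pitches judged sounding:", sounding)
```

In this sketch the pitches whose recovered weight exceeds a threshold are reported as sounding; a real system would run the search once per analysis frame and would need a fitness function and threshold tuned to its own Tone-models.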
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Phon-Amnuaisuk, S. (2013). Transcribing Bach Chorales Using Particle Swarm Optimisations. In: Tan, Y., Shi, Y., Mo, H. (eds) Advances in Swarm Intelligence. ICSI 2013. Lecture Notes in Computer Science, vol 7928. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38703-6_23
DOI: https://doi.org/10.1007/978-3-642-38703-6_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38702-9
Online ISBN: 978-3-642-38703-6
eBook Packages: Computer Science (R0)