Pitch-Related Identification of Instruments in Classical Music Recordings

  • Conference paper
New Frontiers in Mining Complex Patterns (NFMCP 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8983)

Abstract

Identifying particular voices in polyphonic and polytimbral music is a task that musicians perform routinely. Automating this task, however, is very challenging because of the high complexity of audio data. Usually, additional information must be supplied, and even then the results are far from satisfactory. In this paper, we focus on classical music recordings and do not require the user to submit any additional information. Our goal is to identify the musical instruments playing in short audio frames of polyphonic recordings of classical music. Additionally, we extract pitches (or pitch ranges), which, combined with instrument information, can be used in score following and audio alignment (see e.g. [9, 20]) or in work towards automatic score extraction, which is a motivation behind this work. Also, since instrument timbre changes with pitch, separate classifiers are trained for various pitch ranges for each instrument. Four instruments are investigated, representing stringed and wind instruments. We also investigate how adding harmonic (pitch-based) features to the feature set influences the results. Random forests are applied as the classification tool, and the results are presented and discussed.
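
The idea of training separate classifiers per instrument and pitch range can be illustrated with a minimal sketch of the routing step only. It is not the authors' implementation: the range boundaries in `RANGE_EDGES`, the function names, and the instrument label `"string_1"` are hypothetical, since the abstract does not state the actual ranges or instruments. The only fixed fact used is the standard conversion from frequency to MIDI note number (A4 = 440 Hz = note 69).

```python
import math
from typing import Tuple

# Hypothetical pitch-range boundaries as MIDI note numbers (C3, C4, C5, C6).
# The ranges actually used in the paper are not given in the abstract.
RANGE_EDGES = [48, 60, 72, 84]


def hz_to_midi(freq_hz: float) -> float:
    """Convert a frequency in Hz to a fractional MIDI note number."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def pitch_range_index(freq_hz: float) -> int:
    """Return the index of the pitch range that contains the frequency.

    Index 0 is everything below the first edge; the last index is
    everything at or above the last edge.
    """
    midi = hz_to_midi(freq_hz)
    return sum(midi >= edge for edge in RANGE_EDGES)


def classifier_key(instrument: str, freq_hz: float) -> Tuple[str, int]:
    """Build the lookup key for a per-instrument, per-range classifier."""
    return (instrument, pitch_range_index(freq_hz))
```

Under these assumptions, a frame with an estimated pitch of 880 Hz would be passed to the classifier stored under `classifier_key("string_1", 880.0)`, so that each model only sees frames from the pitch range it was trained on.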

References

  1. Boulanger-Lewandowski, N., Bengio, Y., Vincent, P.: Discriminative non-negative matrix factorization for multiple pitch estimation. In: 13th International Society for Music Information Retrieval Conference (ISMIR), pp. 205–210 (2012)

  2. Breiman, L.: Random forests. Mach. Learn. 45, 5–32 (2001)

  3. Essid, S., Richard, G., David, B.: Instrument recognition in polyphonic music based on automatic taxonomies. IEEE Trans. Audio Speech Lang. Process. 14(1), 68–80 (2006)

  4. Goto, M., Hashiguchi, H., Nishimura, T., Oka, R.: RWC music database: popular, classical, and jazz music databases. In: 3rd International Conference on Music Information Retrieval, pp. 287–288 (2002)

  5. Goto, M., Hashiguchi, H., Nishimura, T., Oka, R.: RWC music database: music genre database and musical instrument sound database. In: 4th International Conference on Music Information Retrieval, pp. 229–230 (2003)

  6. Heittola, T., Klapuri, A., Virtanen, T.: Musical instrument recognition in polyphonic audio using source-filter model for sound separation. In: 10th International Society for Music Information Retrieval Conference (2009)

  7. Herrera-Boyer, P., Klapuri, A., Davy, M.: Automatic classification of pitched musical instrument sounds. In: Klapuri, A., Davy, M. (eds.) Signal Processing Methods for Music Transcription. Springer Science+Business Media LLC, US (2006)

  8. ISO: MPEG-7 overview. http://www.chiariglione.org/mpeg/

  9. Izmirli, O., Sharma, G.: Bridging printed music and audio through alignment using a mid-level score representation. In: 13th International Society for Music Information Retrieval Conference (ISMIR), pp. 61–66 (2012)

  10. Kameoka, H., Nishimoto, T., Sagayama, S.: Multi-pitch detection algorithm using constrained Gaussian mixture model and information criterion for simultaneous speech. In: Speech Prosody 2004, pp. 533–536 (2004)

  11. Kirchhoff, H., Dixon, S., Klapuri, A.: Multi-template shift-variant non-negative matrix deconvolution for semi-automatic music transcription. In: 13th International Society for Music Information Retrieval Conference (ISMIR), pp. 415–420 (2012)

  12. Kitahara, T., Goto, M., Komatani, K., Ogata, T., Okuno, H.G.: Instrument identification in polyphonic music: feature weighting to minimize influence of sound overlaps. EURASIP J. Appl. Signal Process. 2007, 1–15 (2007)

  13. Kubera, E., Wieczorkowska, A., Raś, Z., Skrzypiec, M.: Recognition of instrument timbres in real polytimbral audio recordings. In: Balcázar, J.L., Bonchi, F., Gionis, A., Sebag, M. (eds.) ECML PKDD 2010, Part II. LNCS, vol. 6322, pp. 97–110. Springer, Heidelberg (2010)

  14. Kubera, E., Wieczorkowska, A.A.: Mining audio data for multiple instrument recognition in classical music. In: Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z.W. (eds.) NFMCP 2013. LNCS, vol. 8399, pp. 246–260. Springer, Heidelberg (2014)

  15. Kubera, E., Wieczorkowska, A.A., Skrzypiec, M.: Influence of feature sets on precision, recall, and accuracy of identification of musical instruments in audio recordings. In: Andreasen, T., Christiansen, H., Cubero, J.-C., Raś, Z.W. (eds.) ISMIS 2014. LNCS, vol. 8502, pp. 204–213. Springer, Heidelberg (2014)

  16. Kursa, M., Rudnicki, W., Wieczorkowska, A., Kubera, E., Kubik-Komar, A.: Musical instruments in random forest. In: Rauch, J., Raś, Z.W., Berka, P., Elomaa, T. (eds.) ISMIS 2009. LNCS, vol. 5722, pp. 281–290. Springer, Heidelberg (2009)

  17. Martins, L.G., Burred, J.J., Tzanetakis, G., Lagrange, M.: Polyphonic instrument recognition using spectral clustering. In: 8th International Society for Music Information Retrieval Conference (ISMIR) (2007)

  18. Max-Planck-Institut Informatik: chroma toolbox: pitch, chroma, CENS, CRP. http://www.mpi-inf.mpg.de/resources/MIR/chromatoolbox/

  19. MIDOMI: Search for music using your voice by singing or humming. http://www.midomi.com/

  20. Miotto, R., Montecchio, N., Orio, N.: Statistical music modeling aimed at identification and alignment. In: Raś, Z.W., Wieczorkowska, A.A. (eds.) Adv. Music Inform. Retrieval. SCI, vol. 274, pp. 187–212. Springer, Heidelberg (2010)

  21. Müller, M.: Information Retrieval for Music and Motion. Springer, Heidelberg (2007)

  22. Niewiadomy, D., Pelikant, A.: Implementation of MFCC vector generation in classification context. J. Appl. Comput. Sci. 16(2), 55–65 (2008)

  23. Opolko, F., Wapnick, J.: MUMS – McGill University master samples: CD’s (1987)

  24. Oxford University Press: Oxford dictionaries. http://www.oxforddictionaries.com/

  25. Sakaue, D., Otsuka, T., Itoyama, K., Okuno, H.G.: Bayesian nonnegative harmonic-temporal factorization and its application to multipitch analysis. In: 13th International Society for Music Information Retrieval Conference (ISMIR), pp. 91–96 (2012)

  26. Shazam Entertainment Ltd. http://www.shazam.com/

  27. Subrahmanian, V.S.: Principles of Multimedia Database Systems. Morgan Kaufmann, San Francisco (1998)

  28. The University of Iowa Electronic Music Studios: musical instrument samples. http://theremin.music.uiowa.edu/MIS.html

  29. TrackID. https://play.google.com/store/apps/details?id=com.sonyericsson.trackid

  30. Vincent, E., Rodet, X.: Music transcription with ISA and HMM. In: Puntonet, C.G., Prieto, A.G. (eds.) ICA 2004. LNCS, vol. 3195, pp. 1197–1204. Springer, Heidelberg (2004)

  31. Zhang, X., Marasek, K., Ras, Z.W.: Maximum likelihood study for sound pattern separation and recognition. In: IEEE CS International Conference on Multimedia and Ubiquitous Engineering (MUE 2007), Seoul, Korea, pp. 807–812 (2007)

  32. Zweig, G., Nguyen, P.: A segmental CRF approach to large vocabulary continuous speech recognition. In: ASRU 2009: Automatic Speech Recognition and Understanding (2009)

Acknowledgments

This work was partially supported by the Research Center of PJAIT, supported by the Ministry of Science and Higher Education in Poland.

Author information

Corresponding author

Correspondence to Elżbieta Kubera.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Kubera, E., Wieczorkowska, A.A. (2015). Pitch-Related Identification of Instruments in Classical Music Recordings. In: Appice, A., Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2014. Lecture Notes in Computer Science, vol 8983. Springer, Cham. https://doi.org/10.1007/978-3-319-17876-9_13

  • DOI: https://doi.org/10.1007/978-3-319-17876-9_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-17875-2

  • Online ISBN: 978-3-319-17876-9

  • eBook Packages: Computer Science (R0)
