Content-Based Music Retrieval and Visualization System for Ethnomusicological Music Archives

  • Michael Blaß
  • Rolf Bader
Part of the Current Research in Systematic Musicology book series (CRSM, volume 5)


In this chapter we propose a content-based exploration and visualization system for ethnomusicological archives that allows for data access by rhythm similarity. The system extracts an onset-synchronous timbre feature from each audio file of a given collection. Hidden Markov Models are trained on the resulting time series. The transition probability matrices of the models are treated as rhythm fingerprints that represent the music's rhythmic structure in terms of timbre. The self-organizing map algorithm is used to project the high-dimensional fingerprints onto a two-dimensional map. This technique preserves the topology of the high-dimensional feature space, so that similar rhythms end up at similar map positions. A clustering by rhythm similarity is thus achieved. The system therefore supports musicological studies in several ways. The rhythm fingerprinting neither implies a certain theory of music nor introduces cultural bias, so different musics can be compared meaningfully regardless of their origin. Retrieval by similarity allows for an explorative approach to music collections, which can support researchers in finding new hypotheses and in utilizing music archives with little or no metadata. The system is currently being prototyped in the Ethnographic Sound Recordings Archive of the University of Hamburg as a part of the COMSAR project.
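
The fingerprint-and-map idea can be illustrated with a minimal Python sketch. It is not the implementation used in the chapter. As a loudly simplified stand-in for Hidden Markov Model training, it counts transitions in a sequence of already discretized timbre labels (a plain first-order Markov chain, with no hidden states and no Baum-Welch estimation). The function names `transition_fingerprint`, `best_matching_unit`, and `train_som`, the linear decay schedules, and all parameter defaults are assumptions made for illustration only.

```python
import math
import random


def transition_fingerprint(states, n_states):
    """Flattened, row-normalised transition counts of a label sequence.

    Simplified stand-in for an HMM transition matrix (assumption).
    """
    counts = [[0.0] * n_states for _ in range(n_states)]
    for a, b in zip(states, states[1:]):
        counts[a][b] += 1.0
    fingerprint = []
    for row in counts:
        total = sum(row)
        # States that are never left get a uniform row, so each row sums to 1.
        fingerprint.extend(c / total if total else 1.0 / n_states for c in row)
    return fingerprint


def sq_dist(u, v):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def best_matching_unit(weights, x):
    """Grid position (row, col) of the unit whose weight vector is closest to x."""
    positions = ((r, c) for r in range(len(weights)) for c in range(len(weights[0])))
    return min(positions, key=lambda rc: sq_dist(weights[rc[0]][rc[1]], x))


def train_som(data, rows, cols, epochs=100, lr0=0.5, sigma0=None, seed=0):
    """Train a small rectangular self-organizing map with online updates."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                      # learning rate decays to 0
        sigma = sigma0 + (0.5 - sigma0) * frac       # radius moves linearly to 0.5
        for x in rng.sample(data, len(data)):        # shuffled pass over the data
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        # Pull the unit towards x, weighted by grid proximity.
                        w[i] += lr * h * (x[i] - w[i])
    return weights


if __name__ == "__main__":
    alternating = transition_fingerprint([0, 1, 0, 1, 0], 2)  # [0.0, 1.0, 1.0, 0.0]
    sticky = transition_fingerprint([0, 0, 0, 0, 1, 1, 1, 1], 2)
    som = train_som([alternating, sticky], rows=4, cols=4)
    print(best_matching_unit(som, alternating), best_matching_unit(som, sticky))
```

In this sketch, each recording would contribute one fingerprint vector, and its position on the map is the grid coordinate returned by `best_matching_unit`. Because neighbouring units are updated together during training, fingerprints that are close in the high-dimensional space tend to land on nearby units.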

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Hamburg, Germany
