Abstract
This chapter presents several computational approaches aimed at supporting knowledge discovery in music. Our work combines data mining, signal processing and data visualization techniques for the automatic analysis of digital music collections, with a focus on retrieving and understanding musical structure.
We discuss the extraction of midlevel feature representations that convey musically meaningful information from audio signals, and show how such representations can be used to synchronize different instances of a musical work and enable new modes of music content browsing and navigation. Moreover, we utilize these representations to identify repetitive structures and representative patterns in the signal, via self-similarity analysis and matrix decomposition techniques that can be made invariant to changes of local tempo and key. We discuss how structural information can serve to highlight relationships within music collections, and explore the use of information visualization tools to characterize the patterns of similarity and dissimilarity that underpin such relationships.
With the help of illustrative examples computed on a collection of recordings of Frédéric Chopin’s Mazurkas, we aim to show how these content-based methods can facilitate the development of novel modes of access, analysis and interaction with digital content that can empower the study and appreciation of music.
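As a rough illustration of the self-similarity analysis mentioned above, the following minimal sketch builds a cosine self-similarity matrix (SSM) from a sequence of 12-dimensional pitch class profile (chroma) vectors, and a transposition-invariant variant that takes the best match over all cyclic pitch shifts. It is not the chapter's implementation: the function names, the toy input, and the choice of cosine similarity are assumptions made for this example.

```python
import numpy as np


def normalize_columns(features, eps=1e-9):
    """Scale each frame (column) to unit Euclidean norm; all-zero frames stay zero."""
    norms = np.linalg.norm(features, axis=0, keepdims=True)
    return features / np.maximum(norms, eps)


def self_similarity(features):
    """Cosine self-similarity matrix of a (dims x frames) feature sequence."""
    unit = normalize_columns(features)
    return unit.T @ unit


def transposition_invariant_ssm(chroma):
    """Maximum cosine similarity over all cyclic shifts of the pitch-class axis.

    Illustrative assumption: key invariance is approximated by comparing every
    frame against every transposition of every other frame.
    """
    unit = normalize_columns(chroma)
    n_classes = chroma.shape[0]
    candidates = np.stack(
        [unit.T @ np.roll(unit, shift, axis=0) for shift in range(n_classes)]
    )
    return candidates.max(axis=0)


def toy_chroma():
    """Hypothetical input: section A, section B, then A transposed up two semitones."""
    a = np.zeros(12)
    a[[0, 4, 7]] = 1.0           # C major triad
    b = np.zeros(12)
    b[[2, 5, 9]] = 1.0           # D minor triad
    a_up = np.roll(a, 2)         # D major triad
    frames = [a] * 4 + [b] * 4 + [a_up] * 4
    return np.array(frames).T    # shape: (12 pitch classes, 12 frames)


chroma = toy_chroma()
ssm = self_similarity(chroma)
ti_ssm = transposition_invariant_ssm(chroma)
```

In this toy input, the plain SSM treats the transposed repeat of section A as unrelated to the original, because the two triads share no pitch classes, whereas the transposition-invariant matrix marks it as a repetition. Real chroma features would come from an audio front end, which is outside the scope of this sketch.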
Abbreviations
- 2-D: two-dimensional
- MFCC: Mel-frequency cepstral coefficient
- MIDI: musical instrument digital interface
- MIR: music information retrieval
- NCD: normalized compression distance
- PCP: pitch class profile
- RCD: radial convergence diagram
- SI-PLCA: shift-invariant probabilistic latent component analysis
- SSM: self-similarity matrix
- STFT: short-time Fourier transform (also short-term Fourier transform)
Acknowledgements
This material is based upon work supported by the National Science Foundation, under grant IIS-0844654, and the Cluster of Excellence on Multimodal Computing and Interaction at Saarland University. The authors would like to thank Craig Sapp for kindly providing access to the Mazurka dataset and beat annotations.
Copyright information
© 2018 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Bello, J.P., Grosche, P., Müller, M., Weiss, R. (2018). Content-Based Methods for Knowledge Discovery in Music. In: Bader, R. (eds) Springer Handbook of Systematic Musicology. Springer Handbooks. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-55004-5_39
DOI: https://doi.org/10.1007/978-3-662-55004-5_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-55002-1
Online ISBN: 978-3-662-55004-5
eBook Packages: Engineering, Engineering (R0)