Abstract
The “winning” system in the 2013 MIREX Latin Genre Classification Task was a deep neural network trained with simple features. An explanation for its winning performance has yet to be found. In previous work, we built similar systems using the BALLROOM music dataset, and found their performance to be greatly affected by slight changes to the tempo of a test recording. In the MIREX task, however, systems are trained and tested using the Latin Music Dataset (LMD), which is 4.5 times larger than BALLROOM, and which does not seem to show as strong a relationship between tempo and label. In this paper, we reproduce the “winning” deep learning system using LMD, and measure the effects of time dilation on its performance. We find that tempo changes of at most \(\pm 6\,\%\) can both greatly diminish and greatly improve its performance. Taken together with the low-level nature of the input features, this supports the conclusion that the system is exploiting some low-level absolute time characteristics to reproduce ground truth in LMD.
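The time-dilation test described in the abstract can be illustrated with a minimal sketch. It assumes a test recording is a 1-D NumPy array and changes tempo by plain resampling with linear interpolation, which also shifts pitch; the abstract does not say which dilation method the authors used, so this is a stand-in rather than their procedure. The names `change_tempo`, `accuracy`, and the `predict` callable are hypothetical.

```python
import numpy as np

def change_tempo(signal: np.ndarray, percent: float) -> np.ndarray:
    """Resample a 1-D signal so that it plays `percent` faster (or slower if negative).

    Assumption: simple linear-interpolation resampling. This changes pitch
    along with tempo, unlike a pitch-preserving time stretch.
    """
    rate = 1.0 + percent / 100.0  # playback-rate multiplier
    if rate <= 0:
        raise ValueError("tempo change must be greater than -100 percent")
    n_out = max(1, int(round(len(signal) / rate)))
    # Positions in the original signal from which each output sample is read.
    read_positions = np.linspace(0.0, len(signal) - 1, n_out)
    return np.interp(read_positions, np.arange(len(signal)), signal)

def accuracy(predict, recordings, labels, percent=0.0):
    """Fraction of recordings labelled correctly after a tempo change.

    `predict` is a hypothetical callable mapping a signal to a label.
    """
    hits = sum(predict(change_tempo(x, percent)) == y
               for x, y in zip(recordings, labels))
    return hits / len(labels)
```

Sweeping `percent` over a small grid such as `np.linspace(-6, 6, 7)` and comparing the resulting accuracies against the value at `percent=0` gives a rough picture of how sensitive a classifier is to absolute time scale.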
Notes
- 1.
- 2.
- 3. The fold composition in the MIREX task is problematic. Table 2 shows folds 1 and 2 are missing examples of 2 classes, and fold 1 has only one example in another.
- 4.
- 5. Audition this table at http://www.eecs.qmul.ac.uk/~sturm/research/DeSPerFtable2/exp.html.
Acknowledgments
We thank Aggelos Pikrakis for making his code available for analysis and testing. CK and JL were supported in part by the Danish Council for Strategic Research of the Danish Agency for Science, Technology and Innovation under the CoSound project, case number 11-115328. This publication only reflects the authors’ views.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Sturm, B.L., Kereliuk, C., Larsen, J. (2015). ¿El Caballo Viejo? Latin Genre Recognition with Deep Learning and Spectral Periodicity. In: Collins, T., Meredith, D., Volk, A. (eds) Mathematics and Computation in Music. MCM 2015. Lecture Notes in Computer Science, vol 9110. Springer, Cham. https://doi.org/10.1007/978-3-319-20603-5_34
DOI: https://doi.org/10.1007/978-3-319-20603-5_34
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20602-8
Online ISBN: 978-3-319-20603-5
eBook Packages: Computer Science, Computer Science (R0)