¿El Caballo Viejo? Latin Genre Recognition with Deep Learning and Spectral Periodicity

  • Conference paper
Mathematics and Computation in Music (MCM 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9110)

Abstract

The “winning” system in the 2013 MIREX Latin Genre Classification Task was a deep neural network trained with simple features. An explanation for its winning performance has yet to be found. In previous work, we built similar systems using the BALLROOM music dataset, and found that slight changes to the tempo of the music in a test recording greatly affect their performance. In the MIREX task, however, systems are trained and tested using the Latin Music Dataset (LMD), which is 4.5 times larger than BALLROOM, and which does not appear to show as strong a relationship between tempo and label. In this paper, we reproduce the “winning” deep learning system using LMD, and measure the effects of time dilation on its performance. We find that tempo changes of at most \(\pm 6\,\%\) can both greatly diminish and greatly improve its performance. Together with the low-level nature of the input features, this supports the conclusion that the system is exploiting some low-level absolute time characteristics to reproduce ground truth in LMD.
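
The time-dilation measurement described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the 1 % step size, and the `classify` and `stretch` callables are all hypothetical, and the sketch deliberately leaves the actual audio time-stretching to whatever tool the caller supplies.

```python
def dilation_grid(max_change_pct=6.0, step_pct=1.0):
    """Tempo changes in percent, symmetric about zero (step size is an assumption)."""
    n = int(round(max_change_pct / step_pct))
    return [k * step_pct for k in range(-n, n + 1)]


def dilated_tempo(bpm, change_pct):
    """Tempo after the change: +6 % turns 100 BPM into roughly 106 BPM."""
    return bpm * (1.0 + change_pct / 100.0)


def dilated_duration(seconds, change_pct):
    """A faster tempo shortens the excerpt by the same factor."""
    return seconds / (1.0 + change_pct / 100.0)


def accuracy_under_dilation(classify, stretch, excerpts, labels, change_pct):
    """Fraction of excerpts labelled correctly after dilation.

    classify(x) -> label and stretch(x, change_pct) -> x' are
    caller-supplied, hypothetical stand-ins for a trained system
    and a time-stretching tool.
    """
    hits = sum(
        classify(stretch(x, change_pct)) == y
        for x, y in zip(excerpts, labels)
    )
    return hits / len(labels)
```

Sweeping `accuracy_under_dilation` over `dilation_grid()` gives one accuracy figure per tempo change. A system whose accuracy swings widely over such a small range is likely relying on absolute timing rather than on tempo-invariant properties of the music.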

Notes

  1. http://www.music-ir.org/nema_out/mirex2013/results/act/latin_report/summary.html.

  2. http://www.music-ir.org/nema_out/mirex2013/results/act/latin_report/files.html.

  3. The fold composition in the MIREX task is problematic. Table 2 shows folds 1 and 2 are missing examples of 2 classes, and fold 1 has only one example in another.

  4. http://breakfastquay.com/rubberband/.

  5. Audition this table at http://www.eecs.qmul.ac.uk/~sturm/research/DeSPerFtable2/exp.html.

Acknowledgments

We are grateful to Aggelos Pikrakis for making his code available for analysis and testing. CK and JL were supported in part by the Danish Council for Strategic Research of the Danish Agency for Science, Technology and Innovation under the CoSound project, case number 11-115328. This publication only reflects the authors’ views.

Author information

Corresponding author

Correspondence to Bob L. Sturm.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Sturm, B.L., Kereliuk, C., Larsen, J. (2015). ¿El Caballo Viejo? Latin Genre Recognition with Deep Learning and Spectral Periodicity. In: Collins, T., Meredith, D., Volk, A. (eds) Mathematics and Computation in Music. MCM 2015. Lecture Notes in Computer Science, vol 9110. Springer, Cham. https://doi.org/10.1007/978-3-319-20603-5_34

  • DOI: https://doi.org/10.1007/978-3-319-20603-5_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-20602-8

  • Online ISBN: 978-3-319-20603-5

  • eBook Packages: Computer Science (R0)
