
Capturing the Temporal Domain in Echonest Features for Improved Classification Effectiveness

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8382)

Abstract

This paper proposes Temporal Echonest Features to harness the information available from the beat-aligned vector sequences of the features provided by The Echo Nest. Rather than aggregating them via simple averaging, the statistics of their temporal variation are analyzed and used to represent the audio content. We evaluate the performance on four traditional music genre classification test collections and compare the results to state-of-the-art audio descriptors. Experiments reveal that exploiting the temporal variability of beat-aligned vector sequences, and combining different descriptors, improves classification accuracy. Compared to established conventional audio descriptors used as benchmarks, these approaches perform well, often significantly outperforming their predecessors, and can be effectively used for large-scale music genre classification.
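The core idea of the abstract — summarizing a beat-aligned feature-vector sequence by statistics of its temporal variation rather than a plain average — can be sketched as follows. This is an illustrative sketch only, not the paper's actual implementation: the function name, the particular set of statistics (mean, variance, skewness, kurtosis, min, max, median), and the dummy input data are all assumptions.

```python
import numpy as np

def temporal_stats(segments):
    """Summarize a beat-aligned sequence of feature vectors
    (shape: n_segments x n_dims) by statistics of its temporal
    variation, instead of collapsing it to a single mean vector.
    NOTE: the exact statistics used by Temporal Echonest Features
    may differ; this set is an assumption for illustration."""
    x = np.asarray(segments, dtype=float)
    mu = x.mean(axis=0)
    sigma = x.std(axis=0)
    centered = x - mu
    safe = np.where(sigma == 0, 1.0, sigma)      # avoid divide-by-zero
    skew = (centered ** 3).mean(axis=0) / safe ** 3
    kurt = (centered ** 4).mean(axis=0) / safe ** 4 - 3.0
    # Concatenate per-dimension statistics into one fixed-length vector.
    return np.concatenate([mu, sigma ** 2, skew, kurt,
                           x.min(axis=0), x.max(axis=0),
                           np.median(x, axis=0)])

# e.g. 120 beat segments of 12-dimensional timbre vectors
timbre = np.random.default_rng(0).normal(size=(120, 12))
features = temporal_stats(timbre)  # 7 statistics x 12 dims = 84 values
```

However the sequence length varies from track to track, the resulting descriptor has a fixed length, which is what makes it usable as input to standard classifiers.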


Notes

  1. http://www.amazon.com/music
  2. http://www.last.fm
  3. http://www.spotify.com/
  4. http://labrosa.ee.columbia.edu
  5. http://the.echonest.com/
  6. http://us.7digital.com/
  7. http://developer.echonest.com
  8. http://musicbrainz.org
  9. http://www.playme.com
  10. https://github.com/echonest/pyechonest/
  11. https://github.com/tb2332/MSongsDB/tree/master/PythonSrc
  12. http://www.hdfgroup.org/HDF5/
  13. http://marsyas.info
  14. http://www.ifs.tuwien.ac.at/mir/downloads.html



Author information

Correspondence to Alexander Schindler.


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Schindler, A., Rauber, A. (2014). Capturing the Temporal Domain in Echonest Features for Improved Classification Effectiveness. In: Nürnberger, A., Stober, S., Larsen, B., Detyniecki, M. (eds) Adaptive Multimedia Retrieval: Semantics, Context, and Adaptation. AMR 2012. Lecture Notes in Computer Science, vol 8382. Springer, Cham. https://doi.org/10.1007/978-3-319-12093-5_13

  • DOI: https://doi.org/10.1007/978-3-319-12093-5_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12092-8

  • Online ISBN: 978-3-319-12093-5
