Music Genre Classification Using a Gradient-Based Local Texture Descriptor

  • Faisal Ahmed
  • Padma Polash Paul
  • Marina Gavrilova
Conference paper
Part of the Smart Innovation, Systems and Technologies book series (SIST, volume 57)


With the increasing popularity and availability of online music databases that store vast collections of music, automated classification of music genre has attracted significant attention as a way to manage such large-scale databases. This paper presents a new music genre classification method that utilizes gradient-based texture analysis of the spectrograms constructed from the audio signals. We propose to use the gradient directional pattern (GDP), a robust local texture descriptor that exploits gradient directional information to encode the local texture properties of an image. The proposed method first computes spectrograms from the audio signals and then applies the GDP operator to construct feature descriptors that represent micro-level texture details of the spectrograms. We use a support vector machine (SVM) for the classification task. The effectiveness of the proposed method is evaluated using the GTZAN genre collection music database. Our experiments show promising results for the proposed GDP-based spectrogram texture analysis compared with several existing music genre classification methods.
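The feature-extraction stage described above (audio signal, then spectrogram, then GDP codes, then a histogram descriptor) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract does not specify the operator's details, so the following are all assumptions made for illustration: the Sobel masks, the use of `arctan2` with wrap-around angle differences, the 22.5 degree threshold `t`, the frame length and hop size, and the function names `spectrogram`, `gradient_direction`, `gdp_codes` and `gdp_histogram`. Each pixel's 8 neighbours contribute one bit, set when the neighbour's gradient direction lies within `t` of the centre pixel's direction.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT spectrogram (frequency x time), Hann window.
    frame_len and hop are illustrative values, not taken from the paper."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1))).T

def gradient_direction(img):
    """Sobel gradient direction (radians) at every pixel, edge-padded."""
    p = np.pad(np.asarray(img, dtype=float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
       - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.arctan2(gy, gx)

def gdp_codes(img, t=np.deg2rad(22.5)):
    """8-bit code per interior pixel: bit k is 1 when neighbour k's
    gradient direction is within t of the centre's (assumed threshold)."""
    theta = gradient_direction(img)
    h, w = theta.shape
    center = theta[1:-1, 1:-1]
    # Neighbour offsets (dy, dx), clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neigh = theta[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Wrap the angular difference into [-pi, pi] before comparing.
        diff = np.abs(np.angle(np.exp(1j * (neigh - center))))
        codes |= (diff <= t).astype(np.uint8) << np.uint8(bit)
    return codes

def gdp_histogram(img, t=np.deg2rad(22.5)):
    """Normalised 256-bin histogram of GDP codes, used as the descriptor."""
    hist = np.bincount(gdp_codes(img, t).ravel(), minlength=256).astype(float)
    return hist / hist.sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(4096)   # synthetic stand-in for an audio clip
    features = gdp_histogram(spectrogram(audio))
    print(features.shape)
```

One 256-dimensional histogram per clip could then be passed to any SVM implementation for training and prediction, for example `sklearn.svm.SVC` from scikit-learn. The kernel and parameters used in the paper are not stated here, so they are left open.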


Keywords: Music genre classification; Local texture analysis; Spectrogram; Gradient directional pattern (GDP)



The authors would like to thank NSERC Discovery Grant Project 1028463, NSERC Engage, AITF, and MITACS Accelerate for partial support of this project.



Copyright information

© Springer International Publishing Switzerland 2016

Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License, which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Authors and Affiliations

  • Faisal Ahmed (1)
  • Padma Polash Paul (1)
  • Marina Gavrilova (1)
  1. Department of Computer Science, University of Calgary, Calgary, Canada
