A Bag-of-Tones Model with MFCC Features for Musical Genre Classification
Musical genres are categorical labels created by humans to characterize pieces of music. These labels can be highly subjective, but they are typically related to the instrumentation, rhythmic structure, and harmonic content of the music. In this paper, we propose a new model for music genre classification, referred to as the bag-of-tones (BOT) model. It follows the same idea as the bag-of-words (BOW) model in natural language processing and the bag-of-features (BOF) model in image processing. Basic low-level music features, such as Mel-frequency cepstral coefficients (MFCC), are clustered into a set of codewords referred to as “tones”. Under this model, each piece of music is represented by a new feature vector: its distribution over tones. Classical machine learning models such as support vector machines (SVM) can then be applied for genre classification. The model is tested on two datasets. We found that the polynomial kernel function gives the best performance in SVM classification. Compared with previous work, the proposed model outperforms classical models on a given benchmark dataset. More generally, the model can be used to structure the large collections of music available on the Web, and it can play an important role in automatic digital music categorization and retrieval.
Keywords: bag-of-words, bag-of-tones, MFCC, musical genre classification
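The codebook and histogram steps described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions, not details from the paper: plain k-means is used as the clustering method (the abstract only says the features are "clustered"), the function names `learn_tones` and `tone_histogram` are hypothetical, and synthetic 2-D vectors stand in for real MFCC frames. MFCC extraction and the SVM classifier are not shown.

```python
import math
import random


def nearest(frame, codebook):
    """Return the index of the codeword (tone) closest to a frame."""
    return min(range(len(codebook)), key=lambda k: math.dist(frame, codebook[k]))


def learn_tones(frames, n_tones, n_iter=20, seed=0):
    """Learn a codebook with plain k-means (an assumed clustering choice)."""
    rng = random.Random(seed)
    # Initialize codewords from randomly chosen frames.
    codebook = [list(f) for f in rng.sample(frames, n_tones)]
    for _ in range(n_iter):
        groups = [[] for _ in range(n_tones)]
        for f in frames:
            groups[nearest(f, codebook)].append(f)
        for k, members in enumerate(groups):
            if members:  # keep the previous codeword if a cluster is empty
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
    return codebook


def tone_histogram(track_frames, codebook):
    """Represent one track as a normalized distribution over tones."""
    counts = [0] * len(codebook)
    for f in track_frames:
        counts[nearest(f, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Synthetic stand-ins for frame-level features of two tracks (not real MFCCs).
rng = random.Random(1)
track_a = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(50)]
track_b = [[rng.gauss(10, 1), rng.gauss(10, 1)] for _ in range(50)]

codebook = learn_tones(track_a + track_b, n_tones=2)
hist_a = tone_histogram(track_a, codebook)
hist_b = tone_histogram(track_b, codebook)
print(hist_a, hist_b)
```

Each histogram has one entry per tone and sums to 1, so tracks of different lengths map to vectors of the same size. In a full pipeline these vectors would be the inputs to a classifier such as an SVM.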