Abstract
We propose a one-step codebook of frequency-based features to classify bird sounds using Random Forest and Support Vector Machine classifiers. The dataset consists of bird sounds from seventeen (17) bird species, half of which sound strongly similar to a human listener. The codebook acts as a global dictionary that extends the sound features extracted from raw audio files in one step, clustering them to form a high-dimensional feature probability distribution. The one-step codebook approach is compared with traditional audio features, resulting in six different feature sets. Results indicate that simple mean frequency and bandwidth, or even their multi-modal histogram versions, are not accurate enough, performing below 50% when applied to a larger 17-class dataset. Accuracy increases when the audio signal's spectral data is transformed to Mel-frequency cepstral coefficients (MFCC). The codebook approach on MFCC features with a Random Forest classifier performs best, with an accuracy of 93.62% and good prediction results for almost all classes.
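The codebook idea summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the clusters come from plain k-means over frame-level feature vectors pooled across all recordings, it uses synthetic random arrays in place of real MFCC frames, and the function names (`fit_codebook`, `assign`, `encode`), the codebook size of 8, and the 13-dimensional frames are all hypothetical choices made for the example.

```python
import numpy as np


def assign(frames, centroids):
    """Return the index of the nearest centroid for every frame."""
    # Squared Euclidean distance between each frame and each centroid.
    dists = ((frames[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def fit_codebook(frames, k, n_iter=20, seed=0):
    """Learn k codewords with plain k-means (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    # Initialize the codewords from k distinct frames chosen at random.
    start = rng.choice(len(frames), size=k, replace=False)
    centroids = frames[start].astype(float)
    for _ in range(n_iter):
        labels = assign(frames, centroids)
        for j in range(k):
            members = frames[labels == j]
            if len(members):  # leave an empty cluster's codeword unchanged
                centroids[j] = members.mean(axis=0)
    return centroids


def encode(frames, centroids):
    """Describe one recording as a normalized histogram over codewords."""
    counts = np.bincount(assign(frames, centroids), minlength=len(centroids))
    return counts / counts.sum()


# Synthetic stand-in for per-recording feature frames (assumption: in
# practice these would be MFCC vectors extracted from each audio file).
rng = np.random.default_rng(42)
recordings = [rng.normal(loc=i, size=(50, 13)) for i in range(3)]

# The codebook is global: it is fit once on frames pooled from all recordings.
codebook = fit_codebook(np.vstack(recordings), k=8)

# Each recording becomes one fixed-length probability vector.
features = np.array([encode(r, codebook) for r in recordings])
```

Each row of `features` has the same length regardless of how long the recording is, so it could be passed to any standard classifier, such as the Random Forest or Support Vector Machine mentioned in the abstract.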
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Labao, A.B., Clutario, M.A., Naval, P.C. (2018). Classification of Bird Sounds Using Codebook Features. In: Nguyen, N., Hoang, D., Hong, TP., Pham, H., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2018. Lecture Notes in Computer Science, vol 10751. Springer, Cham. https://doi.org/10.1007/978-3-319-75417-8_21
DOI: https://doi.org/10.1007/978-3-319-75417-8_21
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75416-1
Online ISBN: 978-3-319-75417-8
eBook Packages: Computer Science (R0)