Classification of Bird Sounds Using Codebook Features

Labao, Alfonso B.; Clutario, Mark A.; Naval, Prospero C.

doi:10.1007/978-3-319-75417-8_21

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10751)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

We propose a one-step codebook of frequency-based features to classify bird sounds using Random Forest and Support Vector Machine classifiers. The dataset consists of sounds from seventeen (17) bird species, and half of the sounds are strongly similar to a human listener. The codebook acts as a global dictionary that extends the sound features extracted from raw audio files in one step and clusters them to form a high-dimensional feature probability distribution. The one-step codebook approach is compared with traditional audio features, giving six different feature sets. Results indicate that simple mean frequency and bandwidth, and even their multi-modal histogram versions, are not accurate enough: they perform below 50% when applied to the larger 17-class dataset. Accuracy increases when the audio signal's spectral data is transformed to Mel-frequency cepstral coefficients (MFCC). The codebook approach on MFCC features with a Random Forest classifier performs best, with an accuracy of 93.62% and good prediction results for almost all classes.
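The abstract does not spell out the codebook procedure, so the following is only a minimal sketch of a generic k-means codebook, assuming frame-level feature vectors (for example MFCC frames) have already been extracted from each recording. The function names, the toy two-dimensional frames, and the choice of two codewords are illustrative assumptions and are not taken from the paper.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(frame, centroids):
    """Index of the codeword (centroid) closest to a frame."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(frame, centroids[k]))


def fit_codebook(frames, n_codewords, n_iter=20, seed=0):
    """Plain k-means over frames pooled from all recordings (assumed setup)."""
    rng = random.Random(seed)
    centroids = [list(f) for f in rng.sample(frames, n_codewords)]
    for _ in range(n_iter):
        groups = [[] for _ in centroids]
        for f in frames:
            groups[nearest(f, centroids)].append(f)
        for k, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids


def encode(recording, centroids):
    """Normalized histogram of codeword assignments for one recording."""
    counts = [0] * len(centroids)
    for f in recording:
        counts[nearest(f, centroids)] += 1
    total = sum(counts)
    return [c / total for c in counts]


# Toy stand-ins for per-frame features of two recordings (hypothetical data).
rec_a = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]]
rec_b = [[5.0, 5.1], [5.1, 5.0], [0.1, 0.1], [5.0, 5.0]]

codebook = fit_codebook(rec_a + rec_b, n_codewords=2)
hist_a = encode(rec_a, codebook)
hist_b = encode(rec_b, codebook)
print(hist_a, hist_b)
```

Each recording, whatever its length, becomes a fixed-length vector of codeword frequencies that sums to one. Under this sketch, such vectors would be the inputs passed to a classifier such as a Random Forest or a Support Vector Machine.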

Author information

Authors and Affiliations

Computer Vision and Machine Intelligence Group, Department of Computer Science, College of Engineering, University of the Philippines Diliman, Quezon City, Philippines
Alfonso B. Labao, Mark A. Clutario & Prospero C. Naval Jr.

Corresponding author

Correspondence to Prospero C. Naval Jr.

Editor information

Editors and Affiliations

Wrocław University of Science and Technology, Wrocław, Poland
Ngoc Thanh Nguyen
Quang Binh University, Dong Hoi City, Vietnam
Duong Hung Hoang
National University of Kaohsiung, Kaohsiung, Taiwan
Tzung-Pei Hong
Rutgers University, Piscataway, New Jersey, USA
Hoang Pham
Wrocław University of Science and Technology, Wrocław, Poland
Bogdan Trawiński

About this paper

Cite this paper

Labao, A.B., Clutario, M.A., Naval, P.C. (2018). Classification of Bird Sounds Using Codebook Features. In: Nguyen, N., Hoang, D., Hong, TP., Pham, H., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2018. Lecture Notes in Computer Science, vol 10751. Springer, Cham. https://doi.org/10.1007/978-3-319-75417-8_21

DOI: https://doi.org/10.1007/978-3-319-75417-8_21
Published: 14 February 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-75416-1
Online ISBN: 978-3-319-75417-8
eBook Packages: Computer Science, Computer Science (R0)
