Automatic Birdsong Recognition with MFCC Based Syllable Feature Extraction

Chou, Chih-Hsun; Ko, Hui-Yu

doi:10.1007/978-3-642-23641-9_17

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6905)

Included in the following conference series:

International Conference on Ubiquitous Intelligence and Computing

Abstract

In this study, an automatic birdsong recognition system based on syllable features was developed. After syllable segmentation, three syllable features, namely mean, QI and QE, are computed from the mel-frequency cepstral coefficients (MFCCs) of each syllable. These features aim to capture variations in time as well as amplitude transitions of the MFCC sequences. Drawing on the advantages of the fuzzy c-means (FCM) clustering algorithm and linear discriminant analysis (LDA), the resulting feature vector was used to construct an automatic birdsong recognition system, which was applied to a birdsong database covering 420 bird species.
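
As a rough illustration of two of the building blocks named in the abstract, the sketch below averages the per-frame coefficient vectors of each syllable into a single "mean" feature and then groups those features with the textbook fuzzy c-means update rules. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not define QI or QE, so they are omitted; the LDA stage is left out; and the function names (`syllable_mean`, `fuzzy_c_means`), the fuzzifier `m=2.0`, and the toy two-dimensional "MFCC" frames are all hypothetical choices made for the example.

```python
import random


def syllable_mean(frames):
    """Average a syllable's per-frame coefficient vectors into one vector."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means. Returns (centers, memberships).

    memberships[i][j] is the degree to which point i belongs to cluster j;
    each row sums to 1. The defaults here are assumptions, not paper values.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )
        # Membership update: inverse squared distance raised to 1 / (m - 1).
        new_u = []
        for p in points:
            dists = [max(sq_dist(p, ctr), 1e-12) for ctr in centers]
            inv = [dj ** (-1.0 / (m - 1.0)) for dj in dists]
            total = sum(inv)
            new_u.append([v / total for v in inv])
        shift = max(abs(a - b) for ra, rb in zip(u, new_u)
                    for a, b in zip(ra, rb))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Toy data: four "syllables", each a short list of 2-D coefficient frames.
syllables = [
    [[1.0, 2.0], [1.2, 2.2]],
    [[0.9, 1.9], [1.1, 2.1]],
    [[8.0, 9.0], [8.2, 9.2]],
    [[7.9, 8.9], [8.1, 9.1]],
]
features = [syllable_mean(s) for s in syllables]
centers, memberships = fuzzy_c_means(features, c=2)
labels = [row.index(max(row)) for row in memberships]
print(labels)
```

In a fuller pipeline the frames would come from an MFCC front end applied to segmented syllables, and the soft memberships could feed a later classification stage; how the paper combines FCM with LDA is not stated in the abstract, so that step is deliberately not sketched here.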

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, Chung Hua University, No. 707, Sec. 2, WuFu Rd., Hsinchu, 30067, Taiwan (R.O.C.)
Chih-Hsun Chou & Hui-Yu Ko

Editor information

Editors and Affiliations

Department of Computer Science and Information Engineering, Chung Hua University, 300, Hsinchu, Taiwan
Ching-Hsien Hsu
Department of Computer Science, St. Francis Xavier University, B2G 2W5, Antigonish, NS, Canada
Laurence T. Yang & Chunsheng Zhu
Faculty of Computer and Information Sciences, Hosei University, 184-8584, Tokyo, Japan
Jianhua Ma

About this paper

Cite this paper

Chou, CH., Ko, HY. (2011). Automatic Birdsong Recognition with MFCC Based Syllable Feature Extraction. In: Hsu, CH., Yang, L.T., Ma, J., Zhu, C. (eds) Ubiquitous Intelligence and Computing. UIC 2011. Lecture Notes in Computer Science, vol 6905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23641-9_17

DOI: https://doi.org/10.1007/978-3-642-23641-9_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23640-2
Online ISBN: 978-3-642-23641-9
eBook Packages: Computer Science, Computer Science (R0)
