Abstract
In this study, an automatic birdsong recognition system based on syllable features was developed. After syllable segmentation, three syllable features, namely mean, QI and QE, were computed from the MFCCs of each syllable, with the aim of capturing variations in time as well as amplitude transitions of the MFCC sequences. Exploiting the advantages of the fuzzy c-means (FCM) clustering algorithm and linear discriminant analysis (LDA), the resulting feature vector was used to construct an automatic birdsong recognition system, which was applied to a birdsong database with 420 bird species.
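The abstract names fuzzy c-means clustering but does not say how it is configured or where it sits in the pipeline. As a rough illustration of the textbook algorithm only, a minimal NumPy sketch might look like the following. The function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the stopping rule, and the toy data are all assumptions made for this example, not details from the paper; the random 3-dimensional points merely stand in for per-syllable feature vectors.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means on the rows of X (hypothetical helper).

    Returns (centers, U), where U[i, k] is the membership of sample i
    in cluster k and each row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized per sample.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers: means weighted by memberships raised to m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero
        # Membership update: proportional to dist^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Toy data (assumed): two well-separated groups of 3-D feature vectors.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 3)), rng.normal(5.0, 0.1, (20, 3))])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # hard assignment from the soft memberships
```

Unlike k-means, each sample keeps a graded membership in every cluster, and a hard label is taken here only at the end by picking the largest membership.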
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chou, CH., Ko, HY. (2011). Automatic Birdsong Recognition with MFCC Based Syllable Feature Extraction. In: Hsu, CH., Yang, L.T., Ma, J., Zhu, C. (eds) Ubiquitous Intelligence and Computing. UIC 2011. Lecture Notes in Computer Science, vol 6905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23641-9_17
DOI: https://doi.org/10.1007/978-3-642-23641-9_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23640-2
Online ISBN: 978-3-642-23641-9
eBook Packages: Computer Science, Computer Science (R0)