Automatic Birdsong Recognition with MFCC Based Syllable Feature Extraction

  • Conference paper
Ubiquitous Intelligence and Computing (UIC 2011)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 6905))

Abstract

In this study, an automatic birdsong recognition system based on syllable features was developed. After syllable segmentation, three syllable features, namely mean, QI and QE, were computed from the MFCCs of each syllable; they aim to capture variations over time as well as amplitude transitions in the MFCC sequences. The resulting feature vector, together with the fuzzy c-means (FCM) clustering algorithm and linear discriminant analysis (LDA), was used to construct an automatic birdsong recognition system, which was applied to a birdsong database covering 420 bird species.
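
As a rough illustration of the clustering step named in the abstract, the following is a minimal sketch of the textbook fuzzy c-means algorithm in NumPy. It is not the authors' implementation: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the stopping tolerance, and the toy data (two synthetic blobs standing in for per-syllable feature vectors) are all assumptions made for this example. The abstract does not say how FCM and LDA are combined, so only the clustering itself is shown.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook fuzzy c-means (hypothetical helper, not from the paper).

    X has shape (n_samples, n_features). Returns (centers, U), where
    U[i, j] is the membership of sample i in cluster j.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers: membership-weighted means of the samples.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Toy data (an assumption): two well-separated groups of 3-dimensional
# vectors, standing in for syllable feature vectors.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (20, 3)),
               rng.normal(5.0, 0.3, (20, 3))])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)  # hard assignment from the soft memberships
```

Unlike k-means, each sample receives a graded membership in every cluster, so a syllable lying between two clusters is not forced into a single one; taking the `argmax` recovers a hard assignment when one is needed.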

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chou, CH., Ko, HY. (2011). Automatic Birdsong Recognition with MFCC Based Syllable Feature Extraction. In: Hsu, CH., Yang, L.T., Ma, J., Zhu, C. (eds) Ubiquitous Intelligence and Computing. UIC 2011. Lecture Notes in Computer Science, vol 6905. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23641-9_17

  • DOI: https://doi.org/10.1007/978-3-642-23641-9_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-23640-2

  • Online ISBN: 978-3-642-23641-9

  • eBook Packages: Computer Science, Computer Science (R0)