An Efficient Approach for Classification of Speech and Music

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5353)

Abstract

A new method is presented for classifying audio segments as speech or music, motivated by the automatic transcription of broadcast news. Sample entropy (SampEn), a complexity measure for time series, serves as the main discriminating feature. SampEn is a variant of approximate entropy (ApEn) that measures the regularity of a time series. The basic idea is to label a given audio segment as speech or music according to its regularity, which is measured from the sequence of SampEn values calculated over a window. The method is evaluated in experiments on broadcast news shows from BBC radio stations, WBAI news, and UN news, and on music genres with different temporal distributions. The results show that the proposed method is robust, achieving high discrimination accuracy in all experiments.
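As an illustration of the feature named in the abstract, the following is a minimal sketch of sample entropy and of a SampEn sequence computed over sliding windows. It follows the commonly used definition SampEn(m, r) = -ln(A/B), where B and A count template pairs of length m and m + 1 that lie within tolerance r. The parameter defaults (m = 2, r = 0.2 times the standard deviation), the window and hop sizes, and the function names are assumptions for illustration and are not taken from the paper. The abstract does not state the decision rule that maps the sequence to a speech or music label, so none is shown.

```python
import math
import random
from statistics import pstdev


def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn(m, r) of a 1-D sequence, with r = r_factor * std(x) (assumed defaults)."""
    n = len(x)
    if n <= m + 1:
        raise ValueError("series too short for the chosen m")
    r = r_factor * pstdev(x)

    def count_matches(length):
        # Count template pairs (i < j) whose Chebyshev distance is <= r.
        # The same n - m starting points are used for both template lengths.
        count = 0
        for i in range(n - m):
            for j in range(i + 1, n - m):
                if all(abs(x[i + k] - x[j + k]) <= r for k in range(length)):
                    count += 1
        return count

    b = count_matches(m)      # matches of length m
    a = count_matches(m + 1)  # matches of length m + 1
    if a == 0 or b == 0:
        return math.inf  # undefined when no matches exist; reported as infinity here
    return -math.log(a / b)


def sampen_sequence(x, win, hop):
    """SampEn of each length-`win` window, advancing by `hop` samples."""
    return [sample_entropy(x[s:s + win]) for s in range(0, len(x) - win + 1, hop)]


# Synthetic stand-ins for audio: a regular signal and an irregular one.
rng = random.Random(0)
sine = [math.sin(2 * math.pi * t / 25) for t in range(300)]
noise = [rng.gauss(0.0, 1.0) for _ in range(300)]

print("SampEn (sine): ", sample_entropy(sine))
print("SampEn (noise):", sample_entropy(noise))
print("windowed sequence (noise):", sampen_sequence(noise, win=100, hop=50))
```

A lower value indicates a more regular signal, so the sine wave should score below the noise. The pairwise comparison is quadratic in the window length, which is acceptable for short windows but would need a faster formulation for long audio streams.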

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Swe, E.M.M., Pwint, M. (2008). An Efficient Approach for Classification of Speech and Music. In: Huang, YM.R., et al. Advances in Multimedia Information Processing - PCM 2008. PCM 2008. Lecture Notes in Computer Science, vol 5353. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89796-5_6

  • DOI: https://doi.org/10.1007/978-3-540-89796-5_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-89795-8

  • Online ISBN: 978-3-540-89796-5

  • eBook Packages: Computer Science, Computer Science (R0)