Abstract
In this paper we suggest to apply a new feature, called Minimum Energy Density (MED), in discrimination of audio signals between speech and music. Our method is based on the analysis of local energy for 1 or 2.5 seconds audio signals. An elementary analysis of the probability for the power distribution is an effective tool supporting the decision making system. We compare our feature with Percentage of Low Energy Frames (LEF), Modified Low Energy Ratio (MLER) and examine their efficiency for two separate speech/music corpora.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Cabañas Molero, P., Ruiz Reyes, N., Vera Candeas, P., Maldonado Bascon, S.: Low-complexity f0-based speech/nonspeech discrimination approach for digital hearing aids. Multimedia Tools and Applications 54, 291–319 (2011)
Carey, M., Parris, E., Lloyd-Thomas, H.: A comparison of features for speech, music discrimination. In: Proceedings of 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 1, pp. 149–152 (March 1999)
Didiot, E., Illina, I., Fohr, D., Mella, O.: A wavelet-based parameterization for speech/music discrimination. Comput. Speech Lang. 24(2), 341–357 (2010)
Jones, R.C.: Electronic device for automatically discriminating between speech and music forms. US Patent 2761897 (1956)
Lu, L., Jiang, H., Zhang, H.: A robust audio classification and segmentation method. In: Proceedings of the Ninth ACM International Conference on Multimedia, MULTIMEDIA 2001, pp. 203–211. ACM, New York (2001)
Masior, M., Ziółko, M., Kacprzak, S.: Multi-lingual speech samples base, http://speechsamples.agh.edu.pl/
Okamura, S., Aoyama, K.: An experimental study of energy dips for speech and music. Pattern Recognition 16(2), 163–166 (1983)
Saunders, J.: Real-time discrimination of broadcast speech/music. In: Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 1996, vol. 2, pp. 993–996 (May 1996)
Scheirer, E., Slaney, M.: Construction and evaluation of a robust multifeature speech/music discriminator. In: 1997 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 1997, vol. 2, pp. 1331–1334 (April 1997)
Wang, W., Gao, W., Ying, D.: A fast and robust speech/music discrimination approach. In: Proceedings of the 2003 Joint Conference of the Fourth International Conference on Information, Communications and Signal Processing, and Fourth Pacific Rim Conference on Multimedia, vol. 3, pp. 1325–1329 (December 2003)
Wei, Z., Ranran, D., Minhui, P., Qiuhong, W.: Automatic speech corpus construction from broadcasting speech databases. In: 2010 International Conference on Computational Intelligence and Security (CIS), pp. 639–643 (December 2010)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kacprzak, S., Ziółko, M. (2013). Speech/Music Discrimination via Energy Density Analysis. In: Dediu, AH., Martín-Vide, C., Mitkov, R., Truthe, B. (eds) Statistical Language and Speech Processing. SLSP 2013. Lecture Notes in Computer Science(), vol 7978. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39593-2_12
Download citation
DOI: https://doi.org/10.1007/978-3-642-39593-2_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39592-5
Online ISBN: 978-3-642-39593-2
eBook Packages: Computer ScienceComputer Science (R0)