Abstract
In this paper we explore the use of non-linear transformations to improve the performance of an entropy-based voice activity detector (VAD). The idea of using a non-linear transformation comes from previous work in the field of speech linear prediction (LPC) based on source separation techniques, where the score function was added to the classical equations in order to take into account the real distribution of the signal. We explore the possibility of estimating the entropy of frames after calculating their score function, instead of using the original frames. We observe that if the signal is clean, the estimated entropy is essentially the same; but if the signal is noisy, the frames transformed with the score function yield different entropy values for voiced and unvoiced frames. Experimental results show that this makes it possible to detect voice activity under high noise, where the simple entropy method fails.
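The idea of comparing the entropy of a raw frame with the entropy of its score-transformed version can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the abstract does not specify the estimators, so the Gaussian kernel estimate of the score function psi(x) = -p'(x)/p(x), the rule-of-thumb bandwidth, the histogram entropy, and the names `score_function` and `histogram_entropy` are all assumptions made here.

```python
import numpy as np

def score_function(x, bandwidth=None):
    """Kernel estimate of psi(x) = -p'(x)/p(x) at the frame's own samples.

    Assumption: Gaussian kernel with Silverman's rule-of-thumb bandwidth.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if bandwidth is None:
        # The small floor avoids a zero bandwidth for constant frames.
        bandwidth = 1.06 * max(x.std(), 1e-12) * n ** (-1 / 5)
    d = (x[:, None] - x[None, :]) / bandwidth  # pairwise scaled differences
    k = np.exp(-0.5 * d ** 2)                  # unnormalised Gaussian kernel
    p = k.sum(axis=1)                          # proportional to the density p(x)
    dp = (-d * k).sum(axis=1) / bandwidth      # proportional to p'(x), same constant
    return -dp / p                             # the common constant cancels

def histogram_entropy(x, bins=50):
    """Shannon entropy (in nats) of the amplitude histogram of a frame."""
    counts, _ = np.histogram(x, bins=bins)
    prob = counts[counts > 0] / counts.sum()
    return float(-(prob * np.log(prob)).sum())

# Hypothetical usage on a single synthetic frame (not real speech).
rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
h_raw = histogram_entropy(frame)
h_score = histogram_entropy(score_function(frame))
```

A detector in this spirit would compute `h_score` for every frame and compare it with a threshold; how that threshold is chosen is not stated in the abstract and is left open here.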
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Solé-Casals, J., Martí-Puig, P., Reig-Bolaño, R., Zaiats, V. (2010). Score Function for Voice Activity Detection. In: Solé-Casals, J., Zaiats, V. (eds) Advances in Nonlinear Speech Processing. NOLISP 2009. Lecture Notes in Computer Science, vol 5933. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11509-7_10
DOI: https://doi.org/10.1007/978-3-642-11509-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11508-0
Online ISBN: 978-3-642-11509-7
eBook Packages: Computer Science, Computer Science (R0)