Speech Event Detection Using Support Vector Machines

Yélamos, P.; Ramírez, J.; Górriz, J. M.; Puntonet, C. G.; Segura, J. C.

doi:10.1007/11758501_50


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3991)

Included in the following conference series:

International Conference on Computational Science


Abstract

An effective speech event detector is presented in this work for improving the performance of speech processing systems operating in noisy environments. The proposed method is based on a trained support vector machine (SVM) that defines an optimized non-linear decision rule involving the subband SNRs of the input speech. The classification rule is analyzed in the input space, together with the ability of the SVM model to learn how the signal is masked by the background noise. The algorithm also incorporates a noise reduction block working in tandem with the voice activity detector (VAD) that has been shown to be very effective in high-noise environments. The experimental analysis carried out on the Spanish SpeechDat-Car database shows clear improvements over standard VADs, including ITU G.729, ETSI AMR and ETSI AFE for distributed speech recognition (DSR), and over other recently reported VADs.
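As a rough illustration of the kind of decision rule the abstract describes, the sketch below computes per-band SNRs for one frame and evaluates a kernel SVM decision function on them. It is not the authors' implementation: the equal-width band split, the RBF kernel, and every model value (`SUPPORT_VECTORS`, `DUAL_COEFS`, `BIAS`, `GAMMA`) are hypothetical placeholders chosen only to make the example run, and the noise reduction block is omitted. In practice the support vectors and coefficients would come from training on labeled speech and non-speech frames.

```python
import math

def subband_snrs(frame_power, noise_power, n_bands):
    """Per-band SNR in dB from two power spectra of equal length.

    Assumption: bands are equal-width slices of the spectrum.
    """
    size = len(frame_power) // n_bands
    snrs = []
    for k in range(n_bands):
        lo, hi = k * size, (k + 1) * size
        signal = sum(frame_power[lo:hi]) / size
        noise = sum(noise_power[lo:hi]) / size
        # Floor both terms to avoid log of zero or division by zero.
        snrs.append(10.0 * math.log10(max(signal, 1e-12) / max(noise, 1e-12)))
    return snrs

def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Standard SVM decision function f(x) = sum_i c_i K(x_i, x) + b,
    where c_i = alpha_i * y_i for support vector x_i."""
    total = sum(c * rbf_kernel(sv, x, gamma)
                for sv, c in zip(support_vectors, dual_coefs))
    return total + bias

# Hypothetical two-band model: one "speech" and one "noise" support vector.
SUPPORT_VECTORS = [(15.0, 12.0), (0.0, 0.0)]
DUAL_COEFS = [1.0, -1.0]
BIAS = 0.0
GAMMA = 0.01

def is_speech(frame_power, noise_power):
    """Classify one frame as speech (True) or non-speech (False)."""
    x = subband_snrs(frame_power, noise_power, n_bands=2)
    return svm_decision(x, SUPPORT_VECTORS, DUAL_COEFS, BIAS, GAMMA) > 0.0

noise = [1.0] * 8
print(is_speech([10.0] * 8, noise))  # frame well above the noise floor
print(is_speech([1.0] * 8, noise))   # frame at the noise floor
```

Because the decision function is a weighted sum of kernels centered on the support vectors, the boundary it draws in the subband-SNR space is non-linear, which is the property the abstract highlights.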



Author information

Authors and Affiliations

Dept. of Signal Theory, Networking and Communications, University of Granada, Spain
P. Yélamos, J. Ramírez, J. M. Górriz & J. C. Segura
Dept. of Architecture and Computer Technology, University of Granada, Spain
C. G. Puntonet


Editor information

Editors and Affiliations

Advanced Computing and Emerging Technologies Centre, The School of Systems Engineering, University of Reading, RG6 6AY, Reading, United Kingdom
Vassil N. Alexandrov
Department of Mathematics and Computer Science, University of Amsterdam, Kruislaan 403, 1098 SJ, Amsterdam, The Netherlands
Geert Dick van Albada
Faculty of Sciences, Section of Computational Science, University of Amsterdam, Kruislaan 403, 1098 SJ, Amsterdam, The Netherlands
Peter M. A. Sloot
Computer Science Department, University of Tennessee, 37996-3450, Knoxville, TN, USA
Jack Dongarra


About this paper

Cite this paper

Yélamos, P., Ramírez, J., Górriz, J.M., Puntonet, C.G., Segura, J.C. (2006). Speech Event Detection Using Support Vector Machines. In: Alexandrov, V.N., van Albada, G.D., Sloot, P.M.A., Dongarra, J. (eds) Computational Science – ICCS 2006. ICCS 2006. Lecture Notes in Computer Science, vol 3991. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11758501_50


DOI: https://doi.org/10.1007/11758501_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34379-0
Online ISBN: 978-3-540-34380-6
eBook Packages: Computer Science, Computer Science (R0)
