Bio-inspired voice activity detector based on the human speech properties in the modulation domain
Many conventional voice activity detection (VAD) methods assume that the speech signal is acquired with high quality. This paper describes a robust VAD method for speech in noisy environments. The proposed scheme exploits the properties of the modulation spectrum of human speech. The speech signal is split into frequency bands and filtered in the modulation frequency domain to reduce the noise level beforehand. The spectral energy is then evaluated and a noise threshold is calculated. The proposed method provides robust speech detection in a varying noise environment and can be used in speech enhancement and speech coding algorithms. Its characteristics were investigated with different types of noisy speech.
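The processing chain named in the abstract (band splitting, modulation-domain filtering, energy evaluation, noise threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `modulation_vad`, every parameter value, the brick-wall FFT filter used for the modulation band-pass, the 1 to 16 Hz passband, and the percentile-based noise floor are all assumptions chosen for illustration.

```python
import numpy as np

def modulation_vad(x, fs, frame_len=256, hop=128, n_bands=8,
                   mod_lo=1.0, mod_hi=16.0, k=3.0):
    """Illustrative VAD sketch; returns (per-frame speech flags, energy feature).

    All defaults are assumed values, not taken from the paper.
    """
    # 1. Short-time spectrum: Hann-windowed frames, magnitude of the real FFT.
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    spec = np.abs(np.fft.rfft(x[idx] * np.hanning(frame_len), axis=1))

    # 2. Frequency bands: group the FFT bins into n_bands contiguous bands
    #    and use the summed magnitude as each band's temporal envelope.
    bands = np.array_split(np.arange(spec.shape[1]), n_bands)
    env = np.stack([spec[:, b].sum(axis=1) for b in bands], axis=1)

    # 3. Modulation-domain band-pass along the time axis of each envelope.
    #    Assumption: a brick-wall FFT mask stands in for the paper's filter.
    frame_rate = fs / hop
    mod_freqs = np.fft.rfftfreq(n_frames, d=1.0 / frame_rate)
    keep = (mod_freqs >= mod_lo) & (mod_freqs <= mod_hi)
    env_f = np.fft.irfft(np.fft.rfft(env, axis=0) * keep[:, None],
                         n=n_frames, axis=0)

    # 4. Energy of the filtered envelopes, summed over bands and smoothed
    #    with a moving average of roughly 0.25 s.
    energy = (env_f ** 2).sum(axis=1)
    win = max(1, int(round(0.25 * frame_rate)))
    energy = np.convolve(energy, np.ones(win) / win, mode="same")

    # 5. Noise threshold. Assumption: the noise floor is a low percentile
    #    of the energy track, scaled by a fixed factor k.
    noise_floor = np.percentile(energy, 20)
    return energy > k * noise_floor, energy

# Synthetic check: white noise with a 1 s burst of two tones whose amplitude
# is modulated at 4 Hz, a crude stand-in for the syllabic rhythm of speech.
fs = 8000
t = np.arange(3 * fs) / fs
burst = (t >= 1.0) & (t < 2.0)
am = 0.5 * (1.0 + np.cos(2 * np.pi * 4.0 * t))
tones = np.sin(2 * np.pi * 300 * t) + np.sin(2 * np.pi * 1200 * t)
x = burst * am * tones + 0.05 * np.random.default_rng(0).standard_normal(t.size)
flags, energy = modulation_vad(x, fs)
```

Removing the lowest modulation frequencies also removes the slow pedestal of each burst, so a detector built this way tends to stay active for a short time around burst edges. A practical system would likely replace the fixed percentile with an adaptive noise estimate that tracks a changing background.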
Keywords: voice activity detector, modulation spectrum