Bio-inspired voice activity detector based on the human speech properties in the modulation domain

  • A. Shadevsky
  • A. Petrovsky
Conference paper


Many conventional voice activity detection (VAD) methods assume that the speech signal is acquired in high quality. This paper describes a robust VAD method for speech in noisy environments. The proposed scheme exploits the properties of the modulation spectrum of human speech. The speech signal is split into frequency bands and filtered in the modulation frequency domain to reduce the noise level beforehand. The spectral energy is then evaluated and a noise threshold is calculated. The proposed method provides robust speech detection in a varying noise environment and can be used in speech enhancement and speech coding algorithms. Its characteristics were investigated with different types of noisy speech.
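The processing chain outlined in the abstract (band splitting, modulation-domain filtering, energy evaluation, thresholding) can be illustrated with a minimal sketch. It is not the authors' implementation: the Goertzel band analysis, the 1 to 16 Hz modulation passband built from two exponential smoothers, the 25-frame energy smoothing, the percentile-based noise floor, the factor `k`, and all function names are assumptions chosen for illustration only.

```python
import math
import random

def band_envelopes(signal, fs, frame_len, centers_hz):
    """Per-frame magnitude at each band center (Goertzel); one trajectory per band."""
    n_frames = len(signal) // frame_len
    env = [[0.0] * n_frames for _ in centers_hz]
    for b, fc in enumerate(centers_hz):
        coeff = 2.0 * math.cos(2.0 * math.pi * fc / fs)
        for t in range(n_frames):
            s1 = s2 = 0.0
            for x in signal[t * frame_len:(t + 1) * frame_len]:
                s0 = x + coeff * s1 - s2
                s2, s1 = s1, s0
            power = s1 * s1 + s2 * s2 - coeff * s1 * s2
            env[b][t] = math.sqrt(max(power, 0.0)) / frame_len
    return env

def modulation_bandpass(traj, frame_rate, lo_hz=1.0, hi_hz=16.0):
    """Assumed modulation filter: fast smoother minus slow smoother.
    It suppresses the stationary part of a band envelope."""
    if not traj:
        return []
    a_fast = 1.0 - math.exp(-2.0 * math.pi * hi_hz / frame_rate)
    a_slow = 1.0 - math.exp(-2.0 * math.pi * lo_hz / frame_rate)
    fast = slow = traj[0]
    out = []
    for v in traj:
        fast += a_fast * (v - fast)
        slow += a_slow * (v - slow)
        out.append(fast - slow)
    return out

def smooth(values, width):
    """Trailing moving average over at most `width` frames."""
    out, acc = [], 0.0
    for i, v in enumerate(values):
        acc += v
        if i >= width:
            acc -= values[i - width]
        out.append(acc / min(i + 1, width))
    return out

def detect_voice(signal, fs, frame_len=80,
                 centers_hz=(300.0, 500.0, 1000.0, 2000.0), k=4.0):
    """Return one boolean speech/non-speech decision per frame."""
    env = band_envelopes(signal, fs, frame_len, centers_hz)
    n_frames = len(env[0])
    if n_frames == 0:
        return []
    frame_rate = fs / frame_len
    filtered = [modulation_bandpass(tr, frame_rate) for tr in env]
    # Modulation energy summed over bands, smoothed over 25 frames.
    energy = smooth([sum(band[t] ** 2 for band in filtered)
                     for t in range(n_frames)], 25)
    # Assumed noise-floor estimate: the 20th percentile of the energy track.
    noise_floor = sorted(energy)[n_frames // 5]
    threshold = k * noise_floor
    return [e > threshold for e in energy]

def demo_signal(fs, seed=0):
    """Synthetic test input: 3 s of white noise with a 4 Hz amplitude-modulated
    500 Hz tone standing in for speech during the middle second."""
    rng = random.Random(seed)
    out = []
    for n in range(3 * fs):
        t = n / fs
        x = rng.gauss(0.0, 0.1)
        if 1.0 <= t < 2.0:
            am = 0.5 * (1.0 + math.sin(2.0 * math.pi * 4.0 * t))
            x += am * math.sin(2.0 * math.pi * 500.0 * t)
        out.append(x)
    return out
```

Calling `detect_voice(demo_signal(8000), 8000)` yields one decision per 10 ms frame. Because the slow smoother needs time to settle, this sketch keeps flagging frames for a short while after the modulated segment ends, which acts as an unintended hangover; a practical detector would control that behaviour explicitly.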


Keywords: voice activity detector, modulation spectrum




Copyright information

© Springer Science+Business Media, Inc. 2005

Authors and Affiliations

  • A. Shadevsky
    • 1
  • A. Petrovsky
    • 1
  1. Computer Engineering Department, Bialystok Technical University, Belarusian State University of Informatics and Radioelectronics, Minsk, Belarus
