The effect of different acoustic noise on speech signal formant frequency location
The presence of noise is one of the major challenges in speech recognition systems. Different kinds of noise (in particular pink, white and leopard) can adversely affect a speech signal in different ways and to different degrees. In this study, the resistance of a speech signal’s formants to different conventional noises, in other words the displacement of the formants under those noises, was measured. The methodology was to add each noise to the original voice signal and then to measure the resulting displacement of the formant locations. The paper introduces the mean square movement (MSM) parameter, which represents the deviation of the formant frequencies caused by the applied noise. All investigations were conducted under three signal-to-noise ratio (SNR) conditions (5, 10 and 15 dB), which allowed the influence of the SNR on the MSM parameter and on the extent of formant displacement to be assessed. The results indicate that, at these three SNR levels, the formant frequencies were resistant to the machine gun type of noise, whilst white noise caused the most measurable displacement of the formant frequencies.
Keywords: Formant; Automatic speech recognition (ASR); Acoustic; Linear predictive coding (LPC); Hidden Markov model; Auto regressive (AR); Mean square movement (MSM); Vocal tract
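The procedure outlined in the abstract (add a noise to the clean signal at a fixed SNR, then quantify how far the formants move) can be illustrated with a minimal Python sketch. The abstract does not give the formula for MSM, so the definition used here, the mean of the squared differences between clean and noisy formant frequencies, is an assumption based only on the parameter's name. The function names `add_noise_at_snr` and `mean_square_movement`, the synthetic stand-in signal, and the example formant values are all hypothetical and are not taken from the paper. Formant estimation itself (for example by LPC) is not shown.

```python
import numpy as np


def add_noise_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech + noise has the requested SNR in dB."""
    noise = noise[:len(speech)]
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Noise power needed for the target SNR: P_speech / 10^(SNR/10)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_noise_power / noise_power)
    return speech + scaled_noise, scaled_noise


def mean_square_movement(clean_formants, noisy_formants):
    """ASSUMED definition of MSM: mean squared formant displacement (Hz^2)."""
    clean = np.asarray(clean_formants, dtype=float)
    noisy = np.asarray(noisy_formants, dtype=float)
    return float(np.mean((noisy - clean) ** 2))


# Synthetic stand-in for a voice signal (not real speech), 1 s at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)

# White noise only; pink, leopard or machine gun recordings would be
# loaded from files and passed in the same way.
rng = np.random.default_rng(0)
white = rng.standard_normal(len(speech))

for snr_db in (5, 10, 15):  # the three SNR conditions named in the abstract
    noisy, scaled = add_noise_at_snr(speech, white, snr_db)
    achieved = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled ** 2))
    print(f"target {snr_db} dB, achieved {achieved:.2f} dB")

# Hypothetical formant tracks (Hz) before and after adding noise.
clean_f = [500.0, 1500.0, 2500.0]
noisy_f = [510.0, 1480.0, 2500.0]
print("MSM (assumed definition):", mean_square_movement(clean_f, noisy_f))
```

Under this assumed definition, a larger MSM means the noise moved the formants further from their clean positions, so the value can be compared across noise types at each SNR.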