Hybrid Feature Extraction Techniques Using TEO-PWP for Enhancement of Automatic Speech Recognition in Real Noisy Environment

Helali, Wafa; Hajaiej, Zied; Cherif, Adnen

doi:10.1007/978-3-030-22964-1_20

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 150)

Included in the following conference series:

International Conference on Smart Innovation, Ergonomics and Applied Human Factors

Abstract

In recent years, much research has focused on the accuracy of Speech Recognition (SR). Although these systems perform reasonably well in quiet conditions, they are far less effective in distorted conditions or over degraded channels. Given these observations, finding feature extraction methods capable of improving the performance of those systems in non-optimal conditions is an indispensable requirement. This paper presents an investigation of Speech Recognition systems in noisy conditions, using several combinations of three feature extraction algorithms to form new hybrid methods: Teager-Energy Operator-Perceptual Wavelet Packet (TEO-PWP), Mel-Frequency Cepstral Coefficients (MFCC), and Perceptual Linear Prediction (PLP). A Hidden Markov Model (HMM) was used to classify the extracted features. Our model was tested on the TIMIT database, which contains both clean and noisy speech files recorded at different levels of Signal-to-Noise Ratio (SNR). The analytic bases for the speech processing and classification procedures are presented, and the results are reported in terms of speech recognition rates.
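As a rough illustration of the Teager Energy Operator named in the abstract, the sketch below implements its standard discrete form, psi[n] = x[n]^2 - x[n-1]*x[n+1]. The abstract does not say how the operator is combined with the perceptual wavelet packet decomposition, so this sketch assumes it is applied to a plain sequence of samples or subband coefficients. The function name `teager_energy` and the test tone are hypothetical and not taken from the paper.

```python
import math


def teager_energy(x):
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, because the operator is undefined at the
    first and last samples. Inputs shorter than 3 samples give [].
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative check (assumed test signal, not from the paper):
# for a pure tone A*cos(omega*n) the operator yields the constant
# A^2 * sin(omega)^2, so it tracks both amplitude and frequency.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(64)]
psi = teager_energy(tone)
expected = amplitude ** 2 * math.sin(omega) ** 2
```

In a TEO-PWP style front end, one plausible arrangement is to call such a function on each wavelet packet subband before computing cepstral features. That ordering is an assumption here, not a detail confirmed by the abstract.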

Author information

Authors and Affiliations

Research Unit of Processing and Analysis of Electrical and Energetic Systems, Faculty of Sciences of Tunis, University Tunis El-Manar, 2092, Tunis, Tunisia
Wafa Helali, Zied Hajaiej & Adnen Cherif

Corresponding author

Correspondence to Wafa Helali.

Editor information

Editors and Affiliations

Escuela Técnica Superior de Ingeniería y Sistemas de Telecomunicación, Universidad Politécnica de Madrid, Madrid, Spain
César Benavente-Peces
Jeddah Community College, King Abdulaziz University, Jeddah, Saudi Arabia
Sami Ben Slama
Department of Information Systems, King Abdulaziz University, Jeddah, Saudi Arabia
Bassam Zafar

Cite this paper

Helali, W., Hajaiej, Z., Cherif, A. (2019). Hybrid Feature Extraction Techniques Using TEO-PWP for Enhancement of Automatic Speech Recognition in Real Noisy Environment. In: Benavente-Peces, C., Slama, S., Zafar, B. (eds) Proceedings of the 1st International Conference on Smart Innovation, Ergonomics and Applied Human Factors (SEAHF). SEAHF 2019. Smart Innovation, Systems and Technologies, vol 150. Springer, Cham. https://doi.org/10.1007/978-3-030-22964-1_20

DOI: https://doi.org/10.1007/978-3-030-22964-1_20
Published: 21 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22963-4
Online ISBN: 978-3-030-22964-1
eBook Packages: Intelligent Technologies and Robotics (R0)
