A Research on Speech Enhancement Based on Hybrid Parallel Subbands HMM and Neural Network Model

lv, Zhao; Ni, Li; Chen, Shiyu; Wu, Xiaopei

doi:10.1007/978-3-642-41407-7_11

Conference paper
First Online: 20 November 2013

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 278)

Abstract

Robustness is an important issue in automatic speech recognition (ASR) research, particularly for achieving high recognition accuracy in practical applications. However, the recognition rate of a traditional full-band HMM decreases even when only some frequency bands are corrupted by noise. To address this problem, a speech enhancement algorithm based on a hybrid parallel subband HMM and neural network model is proposed. The full-band HMM is split into several subband HMMs, new feature parameters are extracted from the outputs of all subband HMMs, and these features are merged by a BP (backpropagation) neural network to yield a global recognition decision. Experimental results show that the hybrid parallel subband HMM and neural network (PSHMM/NN) model can improve the robustness of a speech recognition system in noisy environments.
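
The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature parameters, the network topology, or the data, so everything below is an assumption made for illustration: the features are taken to be per-band class posteriors derived from subband HMM log-likelihoods, the subband HMM outputs are replaced by synthetic scores in which one band per sample is corrupted, and the names fuse_features, make_data, TinyMLP, and run_demo are hypothetical. It is not the authors' implementation.

```python
import numpy as np

N_BANDS, N_CLASSES = 4, 3  # assumed sizes, not taken from the paper

def fuse_features(scores):
    """Turn a (bands x classes) matrix of subband log-likelihoods
    into one feature vector of per-band class posteriors."""
    s = np.asarray(scores, dtype=float)
    e = np.exp(s - s.max(axis=1, keepdims=True))  # stable softmax per band
    return (e / e.sum(axis=1, keepdims=True)).ravel()

def make_data(n, rng):
    """Synthetic stand-in for subband HMM outputs: clean bands favor
    the true class, one randomly chosen band per sample is pure noise."""
    X = np.empty((n, N_BANDS * N_CLASSES))
    y = rng.integers(0, N_CLASSES, size=n)
    for i in range(n):
        scores = rng.normal(0.0, 1.0, (N_BANDS, N_CLASSES))
        scores[:, y[i]] += 4.0                      # evidence for true class
        bad = rng.integers(0, N_BANDS)              # corrupted band index
        scores[bad] = rng.normal(0.0, 2.0, N_CLASSES)
        X[i] = fuse_features(scores)
    return X, y

class TinyMLP:
    """One-hidden-layer network trained with plain backpropagation."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)

    def forward(self, X):
        self.h = np.tanh(X @ self.W1 + self.b1)
        z = self.h @ self.W2 + self.b2
        e = np.exp(z - z.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)     # softmax outputs

    def train_step(self, X, targets, lr):
        p = self.forward(X)
        dz = (p - targets) / X.shape[0]             # cross-entropy gradient
        dh = (dz @ self.W2.T) * (1.0 - self.h ** 2) # back through tanh
        self.W2 -= lr * (self.h.T @ dz)
        self.b2 -= lr * dz.sum(axis=0)
        self.W1 -= lr * (X.T @ dh)
        self.b1 -= lr * dh.sum(axis=0)

    def predict(self, X):
        return self.forward(X).argmax(axis=1)       # global decision

def run_demo(seed=0, epochs=500, lr=0.5):
    """Train on synthetic subband scores; return held-out accuracy."""
    rng = np.random.default_rng(seed)
    X_train, y_train = make_data(600, rng)
    X_test, y_test = make_data(300, rng)
    net = TinyMLP(X_train.shape[1], 8, N_CLASSES, seed=seed)
    targets = np.eye(N_CLASSES)[y_train]            # one-hot labels
    for _ in range(epochs):
        net.train_step(X_train, targets, lr)
    return float((net.predict(X_test) == y_test).mean())

print("held-out accuracy:", run_demo())
```

In a real system the synthetic scores would be replaced by the log-likelihoods that each subband HMM assigns to each candidate word, and the network would be trained on both clean and noisy utterances so it can learn to discount unreliable bands.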

Acknowledgments

The research work described in this paper was supported by the Anhui University Academic and Technical Leaders Introduce Engineering Foundation (02303203), the National Natural Science Foundation (61271352), and the Training Program on Anhui University College Students Innovation Experiment (KYXL2012058).

Author information

Authors and Affiliations

The Key Laboratory of Intelligent Computing and Signal Processing, Anhui University, Hefei, 230039, China
Zhao lv, Li Ni, Shiyu Chen & Xiaopei Wu

Corresponding author

Correspondence to Xiaopei Wu.

Editor information

Editors and Affiliations

Electrical and Computer Engineering, University of Louisville, Kentucky, USA
Aly A. Farag
Department of Electronic Engineering, Tsinghua University, Beijing, People's Republic of China
Jian Yang
Nanjing University of Information Science & Technology, Nanjing, People's Republic of China
Feng Jiao

About this paper

Cite this paper

lv, Z., Ni, L., Chen, S., Wu, X. (2014). A Research on Speech Enhancement Based on Hybrid Parallel Subbands HMM and Neural Network Model. In: Farag, A., Yang, J., Jiao, F. (eds) Proceedings of the 3rd International Conference on Multimedia Technology (ICMT 2013). Lecture Notes in Electrical Engineering, vol 278. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41407-7_11

DOI: https://doi.org/10.1007/978-3-642-41407-7_11
Published: 20 November 2013
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41406-0
Online ISBN: 978-3-642-41407-7
eBook Packages: Engineering, Engineering (R0)
