
A Research on Speech Enhancement Based on Hybrid Parallel Subbands HMM and Neural Network Model

  • Conference paper

Part of the book series: Lecture Notes in Electrical Engineering ((LNEE,volume 278))

Abstract

Robustness is an important issue in automatic speech recognition (ASR) research, particularly for achieving high recognition accuracy in practical applications. However, the recognition rate of a traditional full-band HMM decreases even when only some frequency bands are corrupted by noise. To address this problem, a speech enhancement algorithm based on a hybrid parallel subband HMM and neural network model is proposed. The full-band HMM is split into several subband HMMs, new feature parameters are extracted from the outputs of all subband HMMs, and these features are merged by a BP (backpropagation) neural network to yield a global recognition decision. Experimental results show that the hybrid parallel subband HMM and neural network (PSHMM/NN) model can improve the robustness of a speech recognition system in noisy environments.
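The merging stage described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the feature parameters, the number of subbands, or the network size, so everything below is an assumption made for illustration: the subband HMM outputs are replaced by synthetic log-likelihood scores (`make_toy_scores`), the "new feature parameters" are taken to be a per-subband softmax over word scores (`subband_features`), and the BP network is a single-hidden-layer perceptron trained by plain gradient descent (`train_bp`). All function names are hypothetical and this is not the authors' implementation.

```python
import numpy as np

def subband_features(scores):
    """Assumed feature: softmax over words within each subband, flattened.
    scores has shape (..., n_bands, n_words) of HMM log-likelihoods."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    post = np.exp(shifted)
    post /= post.sum(axis=-1, keepdims=True)
    return post.reshape(*scores.shape[:-2], -1)

def make_toy_scores(n, n_bands, n_words, rng):
    """Synthetic stand-in for subband HMM outputs: the true word scores
    higher in every subband except one randomly chosen corrupted subband."""
    labels = rng.integers(0, n_words, size=n)
    scores = rng.normal(0.0, 0.5, size=(n, n_bands, n_words))
    scores[np.arange(n), :, labels] += 2.0
    bad = rng.integers(0, n_bands, size=n)
    scores[np.arange(n), bad, :] = rng.normal(0.0, 1.0, size=(n, n_words))
    return scores, labels

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(params, x):
    """One tanh hidden layer followed by a softmax output layer."""
    w1, b1, w2, b2 = params
    h = np.tanh(x @ w1 + b1)
    return h, softmax(h @ w2 + b2)

def cross_entropy(probs, labels):
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def train_bp(x, labels, n_words, n_hidden=8, lr=0.2, epochs=500, seed=0):
    """Full-batch backpropagation; returns parameters and the loss history."""
    rng = np.random.default_rng(seed)
    params = [rng.normal(0.0, 0.1, (x.shape[1], n_hidden)), np.zeros(n_hidden),
              rng.normal(0.0, 0.1, (n_hidden, n_words)), np.zeros(n_words)]
    onehot = np.eye(n_words)[labels]
    losses = []
    for _ in range(epochs):
        h, probs = forward(params, x)
        losses.append(cross_entropy(probs, labels))
        d_out = (probs - onehot) / len(x)               # gradient at softmax input
        d_hid = (d_out @ params[2].T) * (1.0 - h ** 2)  # back through tanh
        grads = [x.T @ d_hid, d_hid.sum(axis=0), h.T @ d_out, d_out.sum(axis=0)]
        params = [p - lr * g for p, g in zip(params, grads)]
    return params, losses

# Toy run: 4 subbands, 3 candidate words (both values are assumptions).
rng = np.random.default_rng(0)
scores, labels = make_toy_scores(400, n_bands=4, n_words=3, rng=rng)
x = subband_features(scores)
params, losses = train_bp(x, labels, n_words=3)
_, probs = forward(params, x)
decisions = probs.argmax(axis=1)  # global recognition decision per utterance
print("loss: %.3f -> %.3f" % (losses[0], losses[-1]))
print("training accuracy: %.3f" % np.mean(decisions == labels))
```

In this toy setting the network can learn to discount a subband whose scores disagree with the others, which is the intuition behind merging subband outputs instead of relying on a single full-band model. A real system would replace `make_toy_scores` with likelihoods from trained subband HMMs.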



Acknowledgments

The research work described in this paper was supported by Anhui University Academic and Technical Leaders Introduce Engineering Foundation (02303203), National Natural Science Foundation (61271352), and Training Program on Anhui University College Students Innovation Experiment (KYXL2012058).

Author information

Corresponding author

Correspondence to Xiaopei Wu.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lv, Z., Ni, L., Chen, S., Wu, X. (2014). A Research on Speech Enhancement Based on Hybrid Parallel Subbands HMM and Neural Network Model. In: Farag, A., Yang, J., Jiao, F. (eds) Proceedings of the 3rd International Conference on Multimedia Technology (ICMT 2013). Lecture Notes in Electrical Engineering, vol 278. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41407-7_11

  • DOI: https://doi.org/10.1007/978-3-642-41407-7_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41406-0

  • Online ISBN: 978-3-642-41407-7

  • eBook Packages: Engineering (R0)
