
Feature Compensation Employing Model Combination for Robust In-Vehicle Speech Recognition


Abstract

An effective feature compensation method is evaluated for reliable speech recognition in real-life in-vehicle environments. The CU-Move corpus, previously collected by RSPG (currently, CRSS) (http://www.utdallas.edu/research/utdrive; Hansen et al., DSP for In-Vehicle and Mobile Systems, 2004), contains a range of speech and noise signals collected from a number of speakers across the United States under actual driving conditions. PCGMM (parallel combined Gaussian mixture model)-based feature compensation (Kim et al., Eurospeech 2003, 2003; Kim et al., ICASSP 2004, 2004), considered in this chapter, uses parallel model combination to generate noise-corrupted speech models by combining clean speech and noise models. To address unknown, time-varying background noise, an interpolation method over multiple environmental models is employed. To alleviate the computational expense of maintaining multiple models, a noise transition model is proposed, motivated by the noise language modeling concept developed in Environmental Sniffing (Akbacak and Hansen, IEEE Trans Audio Speech Lang Process, 15(2):465–477, 2007). The PCGMM method and the proposed scheme are evaluated on the connected single-digits portion of the CU-Move database using the Aurora2 evaluation toolkit. Experimental results indicate that the feature compensation method is effective for improving speech recognition in real-life in-vehicle conditions: employing the noise transition model yielded a 26.78% reduction in computation with only a slight change in overall recognition performance. The resulting system therefore demonstrates an effective strategy for robust speech recognition in noisy in-vehicle environments.


References

  1. http://www.utdallas.edu/research/utdrive.

  2. J.H.L. Hansen, X. Zhang, M. Akbacak, U. Yapanel, B. Pellom, W. Ward, and P. Angkititrakul, “CU-Move: Advanced In-Vehicle Speech Systems for Route Navigation,” DSP for In-Vehicle and Mobile Systems, Springer, 2004.


  3. X. Zhang and J.H.L. Hansen, “CSA-BF: A Constrained Switched Adaptive Beamformer for Speech Enhancement and Recognition in Real Car Environments,” IEEE Transactions on Speech and Audio Processing, vol. 11, no. 6, pp. 733–745, 2003.


  4. W. Kim, S. Ahn, and H. Ko, “Feature Compensation Scheme Based on Parallel Combined Mixture Model,” Eurospeech 2003, pp. 667–680, 2003.


  5. W. Kim, O. Kwon, and H. Ko, “PCMM-Based Feature Compensation Scheme Using Model Interpolation and Mixture Sharing,” ICASSP 2004, pp. 989–992, 2004.


  6. M. Akbacak and J.H.L. Hansen, “Environmental Sniffing: Noise Knowledge Estimation for Robust Speech Systems,” IEEE Transactions on Audio, Speech and Language Processing, vol. 15, no. 2, pp. 465–477, 2007.


  7. A.P. Varga and R.K. Moore, “Hidden Markov Model Decomposition of Speech and Noise,” ICASSP 1990, pp. 845–848, 1990.


  8. M.J.F. Gales and S.J. Young, “Robust Continuous Speech Recognition Using Parallel Model Combination,” IEEE Transactions on Speech and Audio Processing, vol. 4, no. 5, pp. 352–359, 1996.


  9. P.J. Moreno, B. Raj, and R.M. Stern, “Data-Driven Environmental Compensation for Speech Recognition: A Unified Approach,” Speech Communication, vol. 24, pp. 267–285, 1998.


  10. H.G. Hirsch and D. Pearce, “The AURORA Experimental Framework for the Performance Evaluation of Speech Recognition Systems Under Noisy Conditions,” ISCA ITRW ASR2000, Sept. 2000.


  11. ETSI Standard Doc., “Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Front-End Feature Extraction Algorithm; Compression Algorithms,” ETSI ES 201 108 v1.1.2 (2000–04), 2000.


  12. NIST Speech Quality Assurance (SPQA) package version 2.3, http://www.nist.gov/speech.

  13. ETSI Standard Doc., “Speech Processing, Transmission and Quality Aspects (STQ); Distributed Speech Recognition; Advanced Front-End Feature Extraction Algorithm; Compression Algorithms,” ETSI ES 202 050 v1.1.1 (2002–10), 2002.



Author information


Corresponding author

Correspondence to John H. L. Hansen.


Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Kim, W., Hansen, J.H.L. (2009). Feature Compensation Employing Model Combination for Robust In-Vehicle Speech Recognition. In: Takeda, K., Erdogan, H., Hansen, J.H.L., Abut, H. (eds) In-Vehicle Corpus and Signal Processing for Driver Behavior. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-79582-9_19


  • Print ISBN: 978-0-387-79581-2

  • Online ISBN: 978-0-387-79582-9

  • eBook Packages: Engineering (R0)
