CU-Move: Advanced In-Vehicle Speech Systems for Route Navigation

  • Chapter
DSP for In-Vehicle and Mobile Systems

Abstract

In this chapter, we present our recent advances in the formulation and development of an in-vehicle hands-free route navigation system. The system comprises a multi-microphone array processing front-end, an environmental sniffer (for noise analysis), a robust speech recognition system, and a dialog manager and information servers. We also present our recently completed speech corpus for in-vehicle interactive speech systems for route planning and navigation. The corpus consists of five domains: digit strings, route navigation expressions, street and location sentences, phonetically balanced sentences, and a route navigation dialog in a human Wizard-of-Oz-like scenario. Speech from a total of 500 speakers was collected across the United States of America during a six-month period from April to September 2001. While previous attempts at in-vehicle speech systems have generally focused on isolated command words to set radio frequencies, temperature control, etc., the CU-Move system focuses on natural conversational interaction between the user and the in-vehicle system. After presenting our proposed in-vehicle speech system, we consider advances in multi-channel array processing, environmental noise sniffing and tracking, new and more robust acoustic front-end representations and built-in speaker normalization for robust ASR, and our back-end dialog navigation information retrieval sub-system connected to the WWW. Results are presented in each sub-section, with a discussion at the end of the chapter.

Copyright information

© 2005 Springer Science + Business Media, Inc.

About this chapter

Cite this chapter

Hansen, J.H. et al. (2005). CU-Move: Advanced In-Vehicle Speech Systems for Route Navigation. In: Abut, H., Hansen, J.H., Takeda, K. (eds) DSP for In-Vehicle and Mobile Systems. Springer, Boston, MA. https://doi.org/10.1007/0-387-22979-5_2

  • DOI: https://doi.org/10.1007/0-387-22979-5_2

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-387-22978-2

  • Online ISBN: 978-0-387-22979-9

  • eBook Packages: Engineering, Engineering (R0)
