CU-Move: Advanced In-Vehicle Speech Systems for Route Navigation

Hansen, John H.L.; Zhang, Xianxian; Akbacak, Murat; Yapanel, Umit H.; Pellom, Bryan; Ward, Wayne; Angkititrakul, Pongtep

doi:10.1007/0-387-22979-5_2

Abstract

In this chapter, we present our recent advances in the formulation and development of an in-vehicle hands-free route navigation system. The system comprises a multi-microphone array processing front-end, an environmental sniffer (for noise analysis), a robust speech recognition system, and a dialog manager with information servers. We also present our recently completed speech corpus for in-vehicle interactive speech systems for route planning and navigation. The corpus consists of five domains: digit strings, route navigation expressions, street and location sentences, phonetically balanced sentences, and a route navigation dialog in a human Wizard-of-Oz-like scenario. Speech from a total of 500 speakers was collected across the United States of America during a six-month period from April to September 2001. While previous attempts at in-vehicle speech systems have generally focused on isolated command words to set radio frequencies, temperature control, etc., the CU-Move system focuses on natural conversational interaction between the user and the in-vehicle system. After presenting our proposed in-vehicle speech system, we consider advances in multi-channel array processing, environmental noise sniffing and tracking, new and more robust acoustic front-end representations with built-in speaker normalization for robust ASR, and our back-end dialog navigation information retrieval sub-system connected to the WWW. Results are presented in each sub-section, with a discussion at the end of the chapter.

Author information

Authors and Affiliations

Robust Speech Processing Group, Center for Spoken Language Research, University of Colorado at Boulder, Boulder, Colorado, 80309-0594, USA
John H.L. Hansen, Xianxian Zhang, Murat Akbacak, Umit H. Yapanel, Bryan Pellom, Wayne Ward & Pongtep Angkititrakul

Editor information

Editors and Affiliations

Department of Electrical and Computer Engineering, San Diego State University, San Diego, California, USA
Hüseyin Abut
Robust Speech Processing Group, Center for Spoken Language Research, Dept. Speech, Language & Hearing Sciences, Dept. Electrical Engineering, University of Colorado, Boulder, Colorado, USA
John H.L. Hansen
Department of Media Science, Nagoya University, Nagoya, Japan
Kazuya Takeda

About this chapter

Cite this chapter

Hansen, J.H. et al. (2005). CU-Move: Advanced In-Vehicle Speech Systems for Route Navigation. In: Abut, H., Hansen, J.H., Takeda, K. (eds) DSP for In-Vehicle and Mobile Systems. Springer, Boston, MA. https://doi.org/10.1007/0-387-22979-5_2

DOI: https://doi.org/10.1007/0-387-22979-5_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-22978-2
Online ISBN: 978-0-387-22979-9
eBook Packages: Engineering, Engineering (R0)
