Abstract
In this chapter, we present our recent advances in the formulation and development of an in-vehicle hands-free route navigation system. The system comprises a multi-microphone array processing front-end, an environmental sniffer (for noise analysis), a robust speech recognition system, and a dialog manager and information servers. We also present our recently completed speech corpus for in-vehicle interactive speech systems for route planning and navigation. The corpus consists of five domains: digit strings, route navigation expressions, street and location sentences, phonetically balanced sentences, and a route navigation dialog in a human Wizard-of-Oz-like scenario. Speech from a total of 500 speakers was collected across the United States of America during a six-month period from April to September 2001. While previous in-vehicle speech systems have generally focused on isolated command words for tasks such as setting radio frequencies or controlling temperature, the CU-Move system focuses on natural conversational interaction between the user and the in-vehicle system. After presenting our proposed in-vehicle speech system, we consider advances in multi-channel array processing, environmental noise sniffing and tracking, new and more robust acoustic front-end representations and built-in speaker normalization for robust ASR, and our back-end dialog navigation information retrieval sub-system connected to the WWW. Results are presented in each sub-section, with a discussion at the end of the chapter.
Copyright information
© 2005 Springer Science + Business Media, Inc.
About this chapter
Cite this chapter
Hansen, J.H. et al. (2005). CU-Move: Advanced In-Vehicle Speech Systems for Route Navigation. In: Abut, H., Hansen, J.H., Takeda, K. (eds) DSP for In-Vehicle and Mobile Systems. Springer, Boston, MA. https://doi.org/10.1007/0-387-22979-5_2
DOI: https://doi.org/10.1007/0-387-22979-5_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-22978-2
Online ISBN: 978-0-387-22979-9
eBook Packages: Engineering, Engineering (R0)