From Cost Sensitive Embedded Applications to PC-based Systems

doi:10.1007/0-306-47027-6_4

Part of the book series: The International Series in Engineering and Computer Science (SECS, volume 563)

Summary

For both embedded and PC-based systems, the cost of speech recognition has to be negligible compared with the total system cost. This is much easier to achieve on PCs: the processing power of the average PC has increased to the point where software-only solutions that require no additional hardware are now possible. In this chapter, we review techniques that are especially useful in embedded applications or for reducing computational complexity, such as floating-point to fixed-point conversion and fast Gaussian computation and Gaussian clustering, which help reduce the time spent in likelihood computation. We then present case studies for both embedded and PC-based systems. Finally, we focus on some standard Application Programming Interfaces (APIs) that are helping the transition from research prototypes to commercial voice-enabled applications.
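As a rough illustration of the floating-point to fixed-point conversion mentioned above, the following is a minimal sketch and not the chapter's own method. It assumes a 16-bit Q15 format (15 fractional bits), and the helper names `to_q15`, `from_q15`, and `q15_mul` are hypothetical, chosen only for this example.

```python
# Minimal Q15 fixed-point sketch (assumed format: 1 sign bit, 15 fractional bits).
# All names here are illustrative; they do not come from the chapter.

Q = 15                      # number of fractional bits
SCALE = 1 << Q              # 32768, the integer that represents 1.0
INT16_MIN, INT16_MAX = -32768, 32767


def saturate(v: int) -> int:
    """Clamp an integer to the signed 16-bit range instead of wrapping around."""
    return max(INT16_MIN, min(INT16_MAX, v))


def to_q15(x: float) -> int:
    """Quantize a float (nominally in [-1, 1)) to a saturated Q15 integer."""
    return saturate(int(round(x * SCALE)))


def from_q15(v: int) -> float:
    """Convert a Q15 integer back to a float."""
    return v / SCALE


def q15_mul(a: int, b: int) -> int:
    """Multiply two Q15 values using only integer operations."""
    prod = a * b                # Q30 intermediate (fits in 32 bits on a DSP)
    prod += 1 << (Q - 1)        # add half an LSB so the shift rounds
    return saturate(prod >> Q)  # rescale to Q15 and clamp


if __name__ == "__main__":
    half = to_q15(0.5)
    print(half, from_q15(q15_mul(half, half)))
```

Saturating instead of wrapping is one common design choice on fixed-point DSPs, because an overflow then yields the nearest representable value. Whether the chapter uses this particular format or rounding rule is not stated in the summary.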




About this chapter

Cite this chapter

(2002). From Cost Sensitive Embedded Applications to PC-based Systems. In: Robust Speech Recognition in Embedded Systems and PC Applications. The International Series in Engineering and Computer Science, vol 563. Springer, Boston, MA. https://doi.org/10.1007/0-306-47027-6_4

DOI: https://doi.org/10.1007/0-306-47027-6_4
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-7923-7873-0
Online ISBN: 978-0-306-47027-1
eBook Packages: Springer Book Archive
