Summary
High-accuracy speech recognition requires a language model to specify which word sequences are possible, or at least likely. Standard n-gram language models for speech recognition ignore linguistic structure, but more linguistically sophisticated language models are possible. Unification grammars are widely used in natural language processing, and they can be compiled into non-left-recursive context-free grammars, which can then be used in real-time speech recognizers by dynamically expanding them into state-transition networks. A hybrid language model incorporating both a unification grammar and n-gram statistics has been shown to increase speech recognition accuracy. Probabilistic context-free grammars and probabilistic unification grammars are also possible.
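To make the phrase "non-left-recursive context-free grammars" concrete, the following is a minimal sketch of the textbook transformation that removes immediate left recursion from a context-free grammar. It is not taken from the chapter and is not claimed to be the compilation method the chapter describes: the grammar representation, the function name `remove_immediate_left_recursion`, the primed nonterminal names, and the toy noun-phrase grammar are all assumptions made for illustration. Indirect left recursion is not handled.

```python
from typing import Dict, List

# A grammar maps each nonterminal to a list of right-hand sides;
# each right-hand side is a list of symbols (an empty list means epsilon).
Grammar = Dict[str, List[List[str]]]


def remove_immediate_left_recursion(grammar: Grammar) -> Grammar:
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | epsilon.

    Illustrative sketch only: handles immediate left recursion and assumes
    the primed name (A + "'") is not already used in the grammar.
    """
    result: Grammar = {}
    for head, bodies in grammar.items():
        # Left-recursive rules A -> A alpha, stored as their tails alpha.
        # The useless unit rule A -> A is dropped.
        recursive = [body[1:] for body in bodies
                     if len(body) > 1 and body[0] == head]
        # All rules that do not start with A itself.
        others = [list(body) for body in bodies
                  if not body or body[0] != head]
        if not recursive:
            result[head] = others
            continue
        tail = head + "'"  # hypothetical name for the new nonterminal
        result[head] = [beta + [tail] for beta in others]
        result[tail] = [alpha + [tail] for alpha in recursive] + [[]]
    return result


# Toy grammar (an assumption for illustration): NP -> NP PP is left-recursive.
toy: Grammar = {
    "NP": [["NP", "PP"], ["det", "noun"]],
    "PP": [["prep", "NP"]],
}

converted = remove_immediate_left_recursion(toy)
for head, bodies in converted.items():
    for body in bodies:
        print(head, "->", " ".join(body) or "ε")
```

A left-recursive rule such as `NP -> NP PP` would cause unbounded expansion if a recognizer expanded the grammar top-down into a transition network on demand; after the rewrite, every expansion of `NP` starts with a terminal, so each expansion step makes progress through the input.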
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Moore, R.C. (1999). Using Natural-Language Knowledge Sources in Speech Recognition. In: Ponting, K. (ed.) Computational Models of Speech Pattern Processing. NATO ASI Series, vol 169. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60087-6_27
DOI: https://doi.org/10.1007/978-3-642-60087-6_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-64250-0
Online ISBN: 978-3-642-60087-6
eBook Packages: Springer Book Archive