Using Natural-Language Knowledge Sources in Speech Recognition

Part of the book series: NATO ASI Series (NATO ASI F, volume 169)

Summary

High-accuracy speech recognition requires a language model to specify which word sequences are possible, or at least likely. Standard n-gram language models for speech recognition ignore linguistic structure, but more linguistically sophisticated language models are possible. Unification grammars are widely used in natural-language processing, and they can be compiled into non-left-recursive context-free grammars, which can then be used in real-time speech recognizers by dynamically expanding them into state-transition networks. A hybrid language model incorporating both a unification grammar and n-gram statistics has been shown to increase speech recognition accuracy. Probabilistic context-free grammars and probabilistic unification grammars are also possible.
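
The requirement that the compiled grammar be non-left-recursive can be illustrated with a small sketch. A rule such as NP -> NP PP would make a top-down, on-demand expansion of the grammar re-expand NP without ever consuming a word. The code below applies the standard textbook rewrite for immediate left recursion (A -> A a | b becomes A -> b A', A' -> a A' | empty). It is not taken from the chapter: the toy grammar, the function names, and the primed nonterminal naming are assumptions made for illustration, and indirect left recursion is not handled.

```python
from typing import Dict, List

# A grammar maps each nonterminal to a list of right-hand sides.
# Each right-hand side is a list of symbols; [] stands for the empty string.
Grammar = Dict[str, List[List[str]]]


def remove_immediate_left_recursion(grammar: Grammar) -> Grammar:
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | empty.

    Assumptions (illustrative only): the name A' is not already used in
    the grammar, and only immediate left recursion needs to be removed.
    Useless rules of the form A -> A are dropped.
    """
    result: Grammar = {}
    for head, bodies in grammar.items():
        # Tails of the left-recursive rules (the part after the leading A).
        recursive = [body[1:] for body in bodies
                     if len(body) > 1 and body[0] == head]
        # Rules that do not start with A.
        other = [list(body) for body in bodies
                 if not body or body[0] != head]
        if not recursive:
            result[head] = other
            continue
        tail = head + "'"  # hypothetical name for the new nonterminal
        result[head] = [body + [tail] for body in other]
        result[tail] = [rest + [tail] for rest in recursive] + [[]]
    return result


def is_immediately_left_recursive(grammar: Grammar) -> bool:
    """Return True if some rule has its own head as its first symbol."""
    return any(bool(body) and body[0] == head
               for head, bodies in grammar.items()
               for body in bodies)


# Toy grammar (an assumption, not from the chapter).
toy: Grammar = {
    "NP": [["NP", "PP"], ["det", "noun"]],
    "PP": [["prep", "NP"]],
}
converted = remove_immediate_left_recursion(toy)
```

In the converted grammar every NP expansion begins with the terminal det, so a recognizer that expands nonterminals only when it reaches them always consumes a word before it can return to NP.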

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Moore, R.C. (1999). Using Natural-Language Knowledge Sources in Speech Recognition. In: Ponting, K. (ed.) Computational Models of Speech Pattern Processing. NATO ASI Series, vol 169. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60087-6_27

  • DOI: https://doi.org/10.1007/978-3-642-60087-6_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-64250-0

  • Online ISBN: 978-3-642-60087-6

  • eBook Packages: Springer Book Archive
