Stochastic Analysis of Structured Language Modeling

  • Conference paper
Mathematical Foundations of Speech and Language Processing

Part of the book series: The IMA Volumes in Mathematics and its Applications (IMA, volume 138)

Abstract

As previously introduced, the Structured Language Model (SLM) operated with the help of a stack from which less probable sub-parse entries were purged before further words were generated. In this article we generalize the CKY algorithm to obtain a chart that allows the direct computation of language model probabilities, thus rendering the stacks unnecessary. An analysis of the behavior of the SLM leads to a generalization of the Inside–Outside algorithm and thus to rigorous EM-type re-estimation of the SLM parameters. The derived algorithms are computationally expensive, but their demands can be mitigated by the use of appropriate thresholding.
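
For orientation, the chart idea named in the abstract can be illustrated with the textbook inside pass over a CKY chart for a probabilistic context-free grammar in Chomsky normal form. This is only a sketch of the standard algorithm that the chapter generalizes, not the SLM chart itself. The function name `inside_chart`, the toy grammar, and the relative `threshold` pruning rule are all assumptions made for illustration.

```python
def inside_chart(words, lexical, binary, threshold=0.0):
    """Inside probabilities for a PCFG in Chomsky normal form.

    lexical: dict mapping (A, word) -> P(A -> word)
    binary:  dict mapping (A, B, C) -> P(A -> B C)
    Returns chart[(i, j)][A] = P(A derives words[i:j]).
    threshold is a hypothetical relative beam: entries whose probability
    falls below threshold * (best entry in the same cell) are dropped.
    """
    n = len(words)
    chart = {}
    # Width-1 spans: sum the lexical rules that emit each word.
    for i, w in enumerate(words):
        cell = {}
        for (a, word), p in lexical.items():
            if word == w:
                cell[a] = cell.get(a, 0.0) + p
        chart[(i, i + 1)] = cell
    # Wider spans: sum over all split points and all binary rules.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = {}
            for k in range(i + 1, j):
                left, right = chart[(i, k)], chart[(k, j)]
                for (a, b, c), p in binary.items():
                    if b in left and c in right:
                        cell[a] = cell.get(a, 0.0) + p * left[b] * right[c]
            if cell:
                best = max(cell.values())
                cell = {a: p for a, p in cell.items() if p >= threshold * best}
            chart[(i, j)] = cell
    return chart


# Toy grammar (illustrative only, not taken from the chapter).
lexical = {("NP", "she"): 0.5, ("NP", "fish"): 0.5, ("V", "eats"): 1.0}
binary = {("S", "NP", "VP"): 1.0, ("VP", "V", "NP"): 1.0}
chart = inside_chart(["she", "eats", "fish"], lexical, binary)
sentence_probability = chart[(0, 3)]["S"]
```

With `threshold=0.0` nothing is pruned and the top cell holds the exact sentence probability under the toy grammar. Raising the threshold trades exactness for fewer chart entries, which is the kind of saving the abstract alludes to.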

Copyright information

© 2004 Springer Science+Business Media New York

About this paper

Cite this paper

Jelinek, F. (2004). Stochastic Analysis of Structured Language Modeling. In: Johnson, M., Khudanpur, S.P., Ostendorf, M., Rosenfeld, R. (eds) Mathematical Foundations of Speech and Language Processing. The IMA Volumes in Mathematics and its Applications, vol 138. Springer, New York, NY. https://doi.org/10.1007/978-1-4419-9017-4_3

  • DOI: https://doi.org/10.1007/978-1-4419-9017-4_3

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-6484-2

  • Online ISBN: 978-1-4419-9017-4

  • eBook Packages: Springer Book Archive
