Part of the book series: The IMA Volumes in Mathematics and its Applications ((IMA,volume 80))

Abstract

Traditionally, parsing of text is based on an explicit grammar and an associated parsing procedure. Examples of grammars are Context Free, Context Sensitive, Transformational, etc. The grammars are specified in a generative mode. A parsing procedure is then designed for the grammar in question (e.g. LR parsing, CYK parsing, Earley parsing, etc.) and is supposed to reverse the process: given text, find the particular generative sequence whose result was the text.
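
As an illustration of "reversing" a generative grammar, the following is a minimal sketch of CYK recognition, one of the procedures named above. The toy grammar, the lexicon, and the function name `cyk_recognize` are assumptions made for this example and do not come from the chapter; the grammar is assumed to be in Chomsky normal form.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (illustration only).
# BINARY maps a pair of child labels to the parent labels that can produce it.
BINARY = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}
# LEXICAL maps a word to the preterminal labels that can produce it.
LEXICAL = {"the": {"Det"}, "dog": {"N"}, "cat": {"N"}, "saw": {"V"}}


def cyk_recognize(words, start="S"):
    """Return True if the toy grammar can generate the word sequence."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds every nonterminal that derives words[i..j] inclusive.
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        chart[i][i] = set(LEXICAL.get(word, ()))
    # Build longer spans from pairs of adjacent shorter spans.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):  # split point: words[i..k] and words[k+1..j]
                for left, right in product(chart[i][k], chart[k + 1][j]):
                    chart[i][j] |= BINARY.get((left, right), set())
    return start in chart[0][n - 1]
```

Calling `cyk_recognize("the dog saw the cat".split())` succeeds under this toy grammar, while `cyk_recognize("the dog saw".split())` does not, because no rule sequence generates that string. A full parser would also store back-pointers in the chart so that the generative sequence itself can be recovered.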

Parsed text is useful in text understanding or in language translation. In most cases it consists of a tree with labeled nodes and individual words at the leaves of the tree. Understanding systems attempt to derive meaning from operations on the structure of the tree. Machine translators frequently accomplish their task by transforming the tree of the source language into a tree of the target language. There are two major problems with the traditional procedure: a grammar has to be designed, usually by hand, and the corresponding text analysis yields highly ambiguous parses. For some time now, attempts have been made to extract the grammar automatically from data, attach probabilities to its productions, and resolve the parsing ambiguity by selecting the most probable parse. The grammar extraction process has been based on treebanks, which are databases consisting of large amounts of parsed text.
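
The treebank-based approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular system: trees are represented as nested tuples, production probabilities are estimated by relative frequency, and a parse is scored as the product of the probabilities of its productions. The toy treebank and all function names are hypothetical.

```python
from collections import Counter
from math import prod


def productions(tree):
    """Yield (lhs, rhs) for every node of a nested-tuple tree."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        yield (label, (children[0],))  # preterminal -> word
        return
    yield (label, tuple(child[0] for child in children))
    for child in children:
        yield from productions(child)


def estimate(treebank):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(p for tree in treebank for p in productions(tree))
    lhs_counts = Counter()
    for (lhs, _), count in rule_counts.items():
        lhs_counts[lhs] += count
    return {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}


def parse_probability(tree, probs):
    """Product of production probabilities; unseen productions score 0."""
    return prod(probs.get(p, 0.0) for p in productions(tree))


def most_probable(candidates, probs):
    """Resolve ambiguity by picking the highest-probability candidate parse."""
    return max(candidates, key=lambda t: parse_probability(t, probs))


# Hypothetical toy treebank: two analyses of "saw stars with telescopes".
PP = ("PP", ("P", "with"), ("NP", ("N", "telescopes")))
VP_ATTACH = ("VP", ("V", "saw"), ("NP", ("N", "stars")), PP)
NP_ATTACH = ("VP", ("V", "saw"), ("NP", ("NP", ("N", "stars")), PP))
TREEBANK = [VP_ATTACH, VP_ATTACH, NP_ATTACH]
PROBS = estimate(TREEBANK)
```

With this toy data, `most_probable([VP_ATTACH, NP_ATTACH], PROBS)` selects the verb-phrase attachment, simply because that analysis is more frequent in the treebank. Real systems would need smoothing for unseen productions and a chart-based search rather than an explicit list of candidates.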

Cooperating researchers at IBM and the University of Pennsylvania have recently realized that since one is interested in parsing and not in generation, one might as well develop parsers directly, without recourse to the painful process of grammar development. Two separate and promising approaches have emerged, one statistical, one rule-based. This talk will describe both, and point out their differences and affinities.

Copyright information

© 1996 Springer-Verlag New York, Inc.

About this paper

Cite this paper

Jelinek, F. (1996). Direct Parsing of Text. In: Levinson, S.E., Shepp, L. (eds) Image Models (and their Speech Model Cousins). The IMA Volumes in Mathematics and its Applications, vol 80. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-4056-3_4

  • DOI: https://doi.org/10.1007/978-1-4612-4056-3_4

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4612-8482-6

  • Online ISBN: 978-1-4612-4056-3

  • eBook Packages: Springer Book Archive