
Document Classification and Information Extraction


Abstract

In Chapters 4 and 5, we turn our attention to the techniques used for document classification and information extraction [60, 61, 62, 174, 175]. In TEXPROS, the task of document classification is to determine the type of an office document. That is, given an office document, the document classification subsystem identifies the frame template that corresponds to it. Once the defined type of a document is known, efficient storage and access methods can be implemented to improve retrieval performance. The task of information extraction is to extract from the contents of the document the information most relevant to the user. That is, given an office document, the information extraction subsystem forms its frame instance by instantiating the corresponding frame template. Both document classification and information extraction are achieved with the aid of an analysis of the document's structure.
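The relationship between the two subsystems can be illustrated with a minimal sketch: classification selects a frame template, and extraction instantiates that template to produce a frame instance. Everything below is an assumption made for illustration. The `FrameTemplate` class, the `Memo` and `Letter` templates, their attributes, and the cue-word scoring are hypothetical and are not taken from TEXPROS, which according to the abstract relies on analysis of document structure rather than keyword matching.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameTemplate:
    """A document type: a name plus the attributes its instances carry."""
    name: str
    attributes: tuple
    cue_words: tuple  # hypothetical cues; a stand-in for structural analysis


# Hypothetical templates, for illustration only.
MEMO = FrameTemplate("Memo", ("To", "From", "Subject"), ("memorandum", "memo"))
LETTER = FrameTemplate("Letter", ("Sender", "Recipient", "Date"), ("dear", "sincerely"))
TEMPLATES = (MEMO, LETTER)


def classify(text):
    """Return the template whose cue words occur most often in the text.

    Ties are resolved in favour of the template listed first in TEMPLATES.
    """
    words = re.findall(r"[a-z]+", text.lower())
    return max(TEMPLATES, key=lambda t: sum(words.count(c) for c in t.cue_words))


def extract(text, template):
    """Form a frame instance by filling each template attribute.

    Values are read from lines of the form 'Attribute: value'; an attribute
    with no matching line is set to None.
    """
    instance = {}
    for attr in template.attributes:
        pattern = rf"^{re.escape(attr)}:[ \t]*(.*)$"
        match = re.search(pattern, text, flags=re.MULTILINE)
        instance[attr] = match.group(1).strip() if match else None
    return instance


doc = "MEMORANDUM\nTo: All staff\nFrom: P. Ng\nSubject: Filing policy\n"
template = classify(doc)          # step 1: identify the frame template
frame = extract(doc, template)    # step 2: instantiate it as a frame instance
print(template.name, frame)
```

Keeping the template separate from the instance mirrors the split described in the abstract: the template fixes which attributes a document type has, so storage and retrieval can be organised per type, while each instance holds the values extracted from one document.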



Copyright information

© 1996 Kluwer Academic Publishers

About this chapter

Cite this chapter

Liu, Q., Ng, P.A. (1996). Document Classification and Information Extraction. In: Document Processing and Retrieval. Springer, Boston, MA. https://doi.org/10.1007/978-1-4613-1295-6_4


  • DOI: https://doi.org/10.1007/978-1-4613-1295-6_4

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4612-8554-0

  • Online ISBN: 978-1-4613-1295-6

  • eBook Packages: Springer Book Archive
