The Assignment of Subject Descriptors to Magazine Articles


Part of the book series: The Information Retrieval Series (INRE, volume 6)

Conclusions

Successful systems that classify texts and assign subject or classification codes rely upon the words and phrases of the texts. In many text categorization situations the number of patterns is too large to acquire manually. In this case, the classifier is trained on example texts. We investigated three aspects of text classifiers when categorizing magazine articles with broad subject descriptors: feature selection, learning algorithms, and improving the quality of the learned classifier by selecting and grouping the examples. Because the subject descriptors concern the broad topics of the texts, an initial feature selection that identifies the topic terms is important. Selecting important content words and proper names based on term frequency, normalized by the maximum number of times a content term occurs in the text, is effective. Adding knowledge of the discourse structure to the term selection process is useful for certain text classes. Given the limited number of positive examples and the high number of text features in the articles, which belong to a variety of magazines, columns, and subject domains, the results of training a text classifier with the χ² algorithm are very satisfying.
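
The two quantities named in the summary above can be illustrated with a minimal sketch. It is not taken from the chapter, whose exact formulas are not shown here. It assumes that the term weight is the raw count of a content term divided by the highest count of any content term in the same text, and it uses the standard χ² statistic over a 2×2 term/class contingency table as a stand-in for the chapter's χ² algorithm. The function names, the stopword handling, and the sample data are all hypothetical.

```python
from collections import Counter


def normalized_tf(tokens, stopwords=frozenset()):
    """Weight each content term by count / max count within one text.

    Assumption: "content terms" are simply the tokens left after
    removing stopwords; the chapter may select them differently.
    """
    counts = Counter(t.lower() for t in tokens if t.lower() not in stopwords)
    if not counts:
        return {}
    max_count = max(counts.values())
    return {term: n / max_count for term, n in counts.items()}


def chi_square(a, b, c, d):
    """Standard chi-square statistic for one term and one class.

    a: class texts that contain the term
    b: other texts that contain the term
    c: class texts that lack the term
    d: other texts that lack the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # A row or column of the table is empty: no evidence either way.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


if __name__ == "__main__":
    text = ["tax", "tax", "reform", "the"]
    print(normalized_tf(text, stopwords={"the"}))  # {'tax': 1.0, 'reform': 0.5}
    print(chi_square(10, 0, 0, 10))  # 20.0: term perfectly predicts the class
    print(chi_square(5, 5, 5, 5))    # 0.0: term is independent of the class
```

Under these assumptions, a term with a weight near 1.0 is among the most frequent content terms of its text, and a high χ² value marks a term whose presence is strongly associated with a subject descriptor in the training examples.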


Copyright information

© 2002 Kluwer Academic Publishers

About this chapter

Cite this chapter

(2002). The Assignment of Subject Descriptors to Magazine Articles. In: Automatic Indexing and Abstracting of Document Texts. The Information Retrieval Series, vol 6. Springer, Boston, MA. https://doi.org/10.1007/0-306-47017-9_10

  • DOI: https://doi.org/10.1007/0-306-47017-9_10

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-7923-7793-1

  • Online ISBN: 978-0-306-47017-2

  • eBook Packages: Springer Book Archive
