The Assignment of Subject Descriptors to Magazine Articles

doi:10.1007/0-306-47017-9_10

Chapter

Part of the book series: The Information Retrieval Series (INRE, volume 6)

Conclusions

Successful systems that classify texts and assign subject or classification codes rely upon the words and phrases of the texts. In many text categorization situations, the number of patterns is too large to acquire manually. In this case, the classifier is trained on example texts. We investigated three aspects of text classifiers when categorizing magazine articles with broad subject descriptors: feature selection, learning algorithms, and improvement of the quality of the learned classifier by selection and grouping of the examples. Because the subject descriptors concern the broad topics of the texts, an initial feature selection that identifies the topic terms is important. Selecting important content words and proper names based on term frequency, normalized by the maximum number of times any content term occurs in the text, is effective. Adding knowledge of the discourse structure to the term selection process is useful for certain text classes. Given the limited number of positive examples and the high number of text features in articles that belong to a variety of magazines, columns, and subject domains, the results of training a text classifier with the χ² algorithm are very satisfactory.
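The two quantitative ideas named above, term frequency normalized by the maximum term frequency in a text and a χ² score relating a term to a class, can be illustrated with a minimal Python sketch. It is not the chapter's implementation: the stopword list, the 0.5 threshold, the function names, and the use of the standard 2×2 contingency-table χ² statistic are all assumptions made for illustration, and lowercasing the text ignores the proper-name recognition the chapter mentions.

```python
import re
from collections import Counter

# Hypothetical, tiny stopword list; the chapter's actual list is not given here.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "on", "for"}


def normalized_term_frequencies(text):
    """Map each content term to tf / max_tf within a single text."""
    # Simplification: lowercase alphabetic tokens only (no proper-name handling).
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    if not counts:
        return {}
    max_tf = max(counts.values())
    return {term: tf / max_tf for term, tf in counts.items()}


def select_topic_terms(text, threshold=0.5):
    """Keep terms whose normalized frequency reaches an assumed threshold."""
    weights = normalized_term_frequencies(text)
    return sorted(term for term, w in weights.items() if w >= threshold)


def chi_square(n11, n10, n01, n00):
    """Standard chi-square statistic for a 2x2 term/class table.

    n11: texts in the class that contain the term
    n10: texts outside the class that contain the term
    n01: texts in the class that lack the term
    n00: texts outside the class that lack the term
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n10 + n00) * (n11 + n10) * (n01 + n00)
    if denom == 0:
        return 0.0  # degenerate table: the term or class never varies
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


if __name__ == "__main__":
    article = ("The election campaign: the campaign staff and "
               "the election polls in the campaign.")
    print(select_topic_terms(article))
    print(chi_square(10, 0, 0, 10))
```

In this sketch a term that occurs as often as the most frequent content term gets weight 1.0, so the threshold is independent of text length. A term distributed evenly inside and outside a class gets a χ² score of 0, while a term that occurs only in texts of the class scores high.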

Cite this chapter

(2002). The Assignment of Subject Descriptors to Magazine Articles. In: Automatic Indexing and Abstracting of Document Texts. The Information Retrieval Series, vol 6. Springer, Boston, MA. https://doi.org/10.1007/0-306-47017-9_10

DOI: https://doi.org/10.1007/0-306-47017-9_10
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-7923-7793-1
Online ISBN: 978-0-306-47017-2
eBook Packages: Springer Book Archive
