Abstract
This chapter looks at a particular type of classification task, in which the objects are text documents. It describes a method, based on a bag-of-words representation, of processing the documents so that they can be used by the classification algorithms given earlier in this book.
An important special case of text classification arises when the documents are web pages. The automatic classification of web pages is known as hypertext categorisation. The differences between standard text classification and hypertext categorisation are illustrated and issues relating to the latter are discussed.
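The bag-of-words representation mentioned above can be illustrated with a minimal sketch. This is not the chapter's own procedure: the tokenisation rule (lowercase alphabetic runs), the function names `tokenize` and `bag_of_words`, and the two sample documents are all assumptions made for illustration. Each document becomes a vector of word counts over a shared vocabulary, which is the kind of fixed-length input a standard classification algorithm expects.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (an assumed, simple rule)."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return a sorted vocabulary and one word-count vector per document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The vocabulary is every distinct word seen in any document.
    vocabulary = sorted(set().union(*counts))
    # Word order within a document is discarded; only the counts are kept.
    vectors = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, vectors


# Hypothetical sample documents, for illustration only.
docs = ["the cat sat on the mat", "the dog sat"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)
print(vectors)
```

In practice the vocabulary is usually reduced before classification, for example by removing very common words, but that step is left out of this sketch.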
Copyright information
© 2013 Springer-Verlag London
About this chapter
Cite this chapter
Bramer, M. (2013). Text Mining. In: Principles of Data Mining. Undergraduate Topics in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-4884-5_20
DOI: https://doi.org/10.1007/978-1-4471-4884-5_20
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4883-8
Online ISBN: 978-1-4471-4884-5
eBook Packages: Computer Science, Computer Science (R0)