Abstract
Text mining and data mining are contrasted with respect to automated prediction. Models are constructed by training on samples of unstructured documents, and the results are projected to new text. A standard data format for input to prediction methods is described. The key objective of data preparation is to transform text into a numerical format, so that it ultimately shares a common representation with numerical data mining. Different text-mining tasks are introduced that fit within a predictive framework for machine learning. These include document classification, information retrieval, clustering of documents, information extraction, and performance evaluation.
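As an illustration of the data-preparation step mentioned in the abstract, the following is a minimal sketch of one common way to turn text into numbers: a bag-of-words count vector per document. It is not taken from the chapter. The function names (`tokenize`, `build_vocabulary`, `vectorize`), the crude regular-expression tokenizer, and the two sample documents are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumed, deliberately crude tokenizer: lowercase alphanumeric runs.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(documents):
    # Map each distinct token in the training sample to a column index.
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(text, vocabulary):
    # Count occurrences of each vocabulary token; unseen tokens are ignored.
    counts = Counter(tokenize(text))
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        idx = vocabulary.get(tok)
        if idx is not None:
            vector[idx] = n
    return vector


# Hypothetical training sample of unstructured documents.
train_docs = ["Stocks rise as markets rally", "Team wins the final match"]
vocabulary = build_vocabulary(train_docs)

# One row per document, one column per vocabulary token.
matrix = [vectorize(doc, vocabulary) for doc in train_docs]

# New text is projected onto the vocabulary learned from the sample.
new_vec = vectorize("Markets rally again", vocabulary)
```

Each document becomes a fixed-length row of numbers, which is the kind of input that numerical prediction methods expect. Words in new text that never appeared in the training sample (here, "again") simply have no column.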
Copyright information
© 2015 Springer-Verlag London
About this chapter
Cite this chapter
Weiss, S.M., Indurkhya, N., Zhang, T. (2015). Overview of Text Mining. In: Fundamentals of Predictive Text Mining. Texts in Computer Science. Springer, London. https://doi.org/10.1007/978-1-4471-6750-1_1
DOI: https://doi.org/10.1007/978-1-4471-6750-1_1
Publisher Name: Springer, London
Print ISBN: 978-1-4471-6749-5
Online ISBN: 978-1-4471-6750-1
eBook Packages: Computer Science (R0)