Text Mining

  • Reference work entry

Synonyms

Knowledge discovery in text (KDT)

Definition

Text mining is data mining applied to collections of text data. Its goal is to discover knowledge (information or patterns) from text data, which are unstructured or semi-structured. It is a subfield of Data Mining (DM), which is also known as Knowledge Discovery in Databases (KDD). KDD aims to discover knowledge from a variety of data sources, including text data, relational databases, Web data, and user log data. Text mining is also related to other research fields, including Machine Learning (ML), Information Retrieval (IR), Natural Language Processing (NLP), Information Extraction (IE), Statistics, Pattern Recognition (PR), and Artificial Intelligence (AI).
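As a rough illustration of what "discovering patterns" in unstructured text can mean, the following minimal sketch counts word pairs that co-occur in several documents. It is not taken from the entry: the function names (`tokenize`, `frequent_pairs`), the `min_support` threshold, and the toy corpus are all assumptions chosen for illustration, and real text mining systems use far richer preprocessing and models.

```python
import re
from collections import Counter
from itertools import combinations

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def frequent_pairs(documents, min_support=2):
    """Return word pairs that co-occur in at least `min_support` documents.

    Hypothetical helper: a very small stand-in for pattern discovery.
    """
    pair_counts = Counter()
    for doc in documents:
        # Use the set of distinct words so each pair counts once per document;
        # sorting makes every pair appear in alphabetical order.
        words = sorted(set(tokenize(doc)))
        pair_counts.update(combinations(words, 2))
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}

# Toy corpus (assumed for illustration only).
corpus = [
    "Text mining discovers patterns in text.",
    "Data mining discovers patterns in databases.",
    "Information retrieval finds documents.",
]
patterns = frequent_pairs(corpus, min_support=2)
```

Here `patterns` maps each alphabetically ordered word pair to the number of documents containing both words, keeping only pairs that meet the support threshold.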

Historical Background

The phrase Knowledge Discovery in Databases (KDD) was first used at the first KDD workshop in 1989. Marti Hearst [4] first used the term text data mining (TDM) and differentiated it from other concepts such as information retrieval and natural language...

Recommended Reading

  1. Andreas H, Andreas N, Gerhard P. A brief survey of text mining. J Computat Linguistics Lang Technol. 2005;20(1):19–62.

  2. Bing L. Web data mining: exploring hyperlinks contents and usage data. Berlin: Springer; 2007. p. 411–47.

  3. Dipanjan D, Martins AFT. A survey on automatic text summarization. Literature survey for the language and statistics II course at Carnegie Mellon University; November. 2007.

  4. Hearst M. Untangling text data mining. In: Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics; 1999.

  5. Informative and indicative summarization. Available at: http://www1.cs.columbia.edu/~min/papers/sigirDuc01/node2.html.

  6. Liebman M. Bioinformatics: an editorial perspective. Available at: http://www.netsci.org/Science/Bioinform/feature01.html.

  7. Usama F, Gregory P-S, Padhraic S. From data mining to knowledge discovery in databases. AI Mag. 1996;17(3):37–54.

  8. Wayne CL. Multilingual topic detection and tracking: successful research enabled by corpora and evaluation. In: Proceedings of the 27th Annual Meeting of the Association for Computational Linguistics; 2000.

  9. Witten IH. Text mining. In: Singh MP, editor. Practical handbook of internet computing. Boca Raton: Chapman and Hall/CRC Press; 2005. p. 14-1–14-22.

Correspondence to Yanli Cai.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

Cai, Y., Sun, JT. (2018). Text Mining. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_418
