Machine Learning for Text: An Introduction

Aggarwal, Charu C.

doi:10.1007/978-3-319-73531-3_1

Abstract

The extraction of useful insights from text with various types of statistical algorithms is referred to as text mining, text analytics, or machine learning from text. The choice of terminology largely depends on the base community of the practitioner. This book will use these terms interchangeably. Text analytics has become increasingly popular in recent years because of the ubiquity of text data on the Web, social networks, emails, digital libraries, and chat sites.

Author information

Authors and Affiliations

IBM T. J. Watson Research Center, Yorktown Heights, NY, USA
Charu C. Aggarwal

Cite this chapter

Aggarwal, C.C. (2018). Machine Learning for Text: An Introduction. In: Machine Learning for Text. Springer, Cham. https://doi.org/10.1007/978-3-319-73531-3_1

DOI: https://doi.org/10.1007/978-3-319-73531-3_1
Published: 20 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73530-6
Online ISBN: 978-3-319-73531-3
eBook Packages: Computer Science (R0)
