Text Similarity and Clustering

Chapter in Text Analytics with Python

Abstract

In the previous chapters, we covered several techniques for analyzing text and extracting insights from it. We looked at supervised machine learning techniques, which categorize text documents into a set of predefined categories. We also covered unsupervised techniques such as topic models and document summarization, which retrieve key themes and information from large text documents and corpora.

Copyright information

© 2019 Dipanjan Sarkar

About this chapter

Cite this chapter

Sarkar, D. (2019). Text Similarity and Clustering. In: Text Analytics with Python. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-4354-1_7
