Text Analytics: The Dark Data Frontier

  • Chapter
  • First Online:
Cognitive Computing Recipes

Abstract

Text is everywhere. Analysts at Gartner estimate that upward of 80 percent of enterprise data today is unstructured. Our everyday interactions generate torrents of such data, including tweets, blog posts, advertisements, news, articles, research papers, descriptions, emails, YouTube comments, Yelp reviews, surveys from your insurance company, and call transcripts. The volume of unstructured data is tremendous, and the majority of it is text. This large body of mostly monetizable data (YouTube comments excepted; those are toxic!) is often classified as dark data. The origin of the term is not well known, but it was popularized by Stanford’s Dr. Chris Ré, who founded the DeepDive program for extracting valuable information from dark data. The term refers to the mountains of raw information that are collected in various ways but remain difficult to analyze.

Notes

  1. A comprehensive list of Azure solution architectures to help you design and implement secure, highly available, performant, and resilient solutions on Azure can be found at https://azure.microsoft.com/en-us/solutions/architecture/

Copyright information

© 2019 Adnan Masood, Adnan Hashmi

About this chapter

Cite this chapter

Masood, A., Hashmi, A. (2019). Text Analytics: The Dark Data Frontier. In: Cognitive Computing Recipes. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-4106-6_4
