Abstract
Text is everywhere. Analysts at Gartner estimate that upward of 80 percent of enterprise data today is unstructured. Our everyday interactions generate torrents of such data: tweets, blog posts, advertisements, news, articles, research papers, descriptions, emails, YouTube comments, Yelp reviews, surveys from your insurance company, and call transcripts. The majority of this unstructured data is text. Another general way to describe this large amount of mostly monetizable data (except YouTube comments, which are toxic!) is to classify it as dark data. The origin of the term is not well known, but it was popularized by Stanford’s Dr. Chris Ré, who founded the DeepDive program for extracting valuable information from dark data. The term refers to the mountains of raw information that are collected in various ways but remain difficult to analyze.
Notes
1. A comprehensive list of Azure solution architectures to help you design and implement secure, highly available, performant, and resilient solutions on Azure can be found here: https://azure.microsoft.com/en-us/solutions/architecture/
Copyright information
© 2019 Adnan Masood, Adnan Hashmi
About this chapter
Cite this chapter
Masood, A., Hashmi, A. (2019). Text Analytics: The Dark Data Frontier. In: Cognitive Computing Recipes. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-4106-6_4
DOI: https://doi.org/10.1007/978-1-4842-4106-6_4
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-4105-9
Online ISBN: 978-1-4842-4106-6
eBook Packages: Professional and Applied Computing; Apress Access Books; Professional and Applied Computing (R0)