Synonyms
Probabilistic databases; Probabilistic information extraction; Probabilistic knowledge bases
Definition
Entity extraction is the process of extracting structured entities, together with their attributes, from unstructured text. For example, a structured paper entity with author names, title, and journal name can be extracted from a citation. Similarly, a professor entity with job title, email, and research interests can be extracted from his or her homepage. The result of entity extraction is a set of structured entity records.
Probabilistic entity extractions are structured entity attributes and records extracted from text, each associated with a probability of correctness. Because automatic entity extraction is imperfect, this probability is usually generated by the state-of-the-art statistical information extraction model that produced the extraction.
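As an illustration of the definition, a probabilistic entity extraction can be sketched as a record whose attributes each carry a probability of correctness. The sketch below is only one possible representation, not the design of any particular system: the class names, the `confident_attributes` helper, and the sample values are all hypothetical, and the record-level probability assumes that attributes are independent, which statistical extraction models generally do not guarantee.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Tuple


@dataclass(frozen=True)
class AttributeExtraction:
    """One extracted attribute value with its probability of correctness."""
    name: str
    value: str
    probability: float  # in [0, 1], as reported by the extraction model


@dataclass(frozen=True)
class EntityExtraction:
    """A structured entity record made of probabilistic attributes."""
    entity_type: str
    attributes: Tuple[AttributeExtraction, ...]

    def record_probability(self) -> float:
        # ASSUMPTION: attributes are treated as independent, so the record
        # probability is the product of the attribute probabilities.
        return prod(a.probability for a in self.attributes)


def confident_attributes(
    entity: EntityExtraction, threshold: float
) -> List[AttributeExtraction]:
    """Keep only attributes whose probability is at least `threshold`."""
    return [a for a in entity.attributes if a.probability >= threshold]


# Hypothetical paper entity extracted from a citation string.
paper = EntityExtraction(
    entity_type="paper",
    attributes=(
        AttributeExtraction("author", "A. Author", 0.9),
        AttributeExtraction("title", "An Example Title", 0.8),
        AttributeExtraction("journal", "Example Journal", 0.5),
    ),
)
```

Under the independence assumption, `paper.record_probability()` is approximately 0.36, and `confident_attributes(paper, 0.8)` keeps the author and title attributes while dropping the less certain journal attribute.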
The management of probabilistic entity extractions requires not only scalable execution...
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Wang, D.Z. (2018). Managing Probabilistic Entity Extraction. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_80762
DOI: https://doi.org/10.1007/978-1-4614-8265-9_80762
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering