Managing Probabilistic Entity Extraction
Probabilistic databases; Probabilistic information extraction; Probabilistic knowledge bases
Entity extraction is the process of extracting structured entities with corresponding attributes from unstructured text data. For example, a structured paper entity can be extracted from a citation with corresponding author names, title, and journal names. Alternatively, a professor entity can be extracted from his or her homepage with corresponding job title, email, and research interests. The result of entity extraction is a set of structured entity records.
Probabilistic entity extractions are structured entity attributes and records extracted from text, each associated with a probability of correctness. Because automatic entity extraction is inherently imperfect, these probabilities are usually generated by state-of-the-art statistical information extraction models.
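As an illustration only, the following minimal Python sketch shows one way such a record could be represented: each attribute carries its own probability of correctness, and a record-level probability is derived from them. The class names, the example values, and especially the assumption that attribute errors are independent (so that the record probability is a simple product) are hypothetical simplifications introduced here; statistical extraction models generally induce correlated probabilities.

```python
from dataclasses import dataclass
from math import prod
from typing import Dict, Tuple


@dataclass(frozen=True)
class ProbabilisticAttribute:
    """One extracted attribute with its probability of correctness."""
    name: str
    value: str
    probability: float  # confidence that `value` is correct, in [0, 1]

    def __post_init__(self) -> None:
        if not 0.0 <= self.probability <= 1.0:
            raise ValueError("probability must lie in [0, 1]")


@dataclass(frozen=True)
class ProbabilisticEntity:
    """A structured entity record built from probabilistic attributes."""
    entity_type: str
    attributes: Tuple[ProbabilisticAttribute, ...]

    def record_probability(self) -> float:
        # Assumption (not from the source): attribute errors are independent,
        # so the probability that the whole record is correct is the product.
        return prod(a.probability for a in self.attributes)

    def confident_attributes(self, threshold: float) -> Dict[str, str]:
        # Keep only attributes whose probability meets the threshold.
        return {
            a.name: a.value
            for a in self.attributes
            if a.probability >= threshold
        }


# Hypothetical extraction result for a citation string.
paper = ProbabilisticEntity(
    entity_type="paper",
    attributes=(
        ProbabilisticAttribute("author", "A. Author", 0.9),
        ProbabilisticAttribute("title", "An Example Title", 0.8),
        ProbabilisticAttribute("journal", "Example Journal", 0.5),
    ),
)

print(paper.record_probability())
print(paper.confident_attributes(threshold=0.75))
```

With a threshold of 0.75, only the author and title attributes of this hypothetical record would be retained, while the low-confidence journal attribute would be set aside for further inspection.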
The management of probabilistic entity extractions requires not only scalable execution...