Probabilistic databases extend standard databases with probabilities in order to model uncertainty in data; query evaluation becomes probabilistic inference.
A probabilistic database is a database in which every tuple t belongs to the database with some probability P(t). When P(t) = 1, the tuple is certain to belong to the database; when 0 < P(t) < 1, it belongs to the database only with some probability; when P(t) = 0, the tuple is certain not to belong to the database, and we usually don’t even bother representing it. A traditional (deterministic) database corresponds to the case when P(t) = 1 for all tuples t. Tuples with P(t) > 0 are called possible tuples. In addition to giving the probabilities of all tuples, a probabilistic database must also indicate how the tuples are correlated. In the simplest cases, the tuples are declared to be either independent (when P(t1 t2) = P(t1) · P(t2)) or exclusive (also called disjoint, when P(t1 t2) = 0).
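The definition above can be made concrete with a small sketch. It assumes a tuple-independent table and enumerates every subset of tuples (a "possible world") together with its probability; the tuple names, the probabilities, and the helper functions `possible_worlds` and `prob` are hypothetical and chosen only for illustration. Enumeration is exponential in the number of tuples, so this is meant to illustrate the semantics, not to serve as a practical evaluation method.

```python
from itertools import product

# Hypothetical table: each tuple name maps to its marginal probability P(t).
# Assumption: all tuples are independent of each other.
tuples = {"t1": 0.5, "t2": 0.8, "t3": 1.0}


def possible_worlds(probs):
    """Yield (world, probability) pairs under the independence assumption."""
    names = list(probs)
    for present in product([True, False], repeat=len(names)):
        p = 1.0
        for name, is_in in zip(names, present):
            # Independent tuples: multiply P(t) if present, 1 - P(t) if absent.
            p *= probs[name] if is_in else 1.0 - probs[name]
        if p > 0:  # worlds with probability 0 are not possible; skip them
            world = frozenset(n for n, is_in in zip(names, present) if is_in)
            yield world, p


def prob(event, probs):
    """Probability that event(world) holds, summed over all possible worlds."""
    return sum(p for world, p in possible_worlds(probs) if event(world))


worlds = list(possible_worlds(tuples))

# Joint probability of two independent tuples: P(t1 t2) = P(t1) * P(t2).
p_both = prob(lambda w: "t1" in w and "t2" in w, tuples)

# Probability that at least one of t1, t2 is present.
p_either = prob(lambda w: "t1" in w or "t2" in w, tuples)
```

Because `t3` is certain (P = 1), it appears in every possible world, so only the four combinations of `t1` and `t2` survive. Exclusive tuples would need a different world generator, since at most one tuple of an exclusive group can be present in any world.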