Abstract
We introduce the problem of learning the parameters of the probabilistic database ProbLog. Given the observed success probabilities of a set of queries, we compute fact probabilities that yield a low approximation error on the training examples as well as on unseen examples. Assuming Gaussian error terms on the observed success probabilities, this naturally leads to a least squares optimization problem. Our approach, called LeProbLog, can learn from queries, from proofs, or from both simultaneously. This makes it flexible and allows faster training in domains where proofs are available. Experiments on real-world data show the usefulness and effectiveness of this least squares calibration of probabilistic databases.
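The least squares setting described in the abstract can be illustrated with a minimal sketch. It is not the LeProbLog implementation: the three-fact toy program, the query names, the observed probabilities, the sigmoid reparameterization, and the finite-difference gradient are all assumptions made for illustration. Each query is given by its proofs (sets of facts that must all be true), its success probability is computed exactly by enumerating possible worlds, and the fact probabilities are fitted by gradient descent on the mean squared error.

```python
import itertools
import math

# Hypothetical toy program: three independent probabilistic facts (indices 0..2).
# Each query is listed with its proofs; a proof is a set of fact indices that
# must all be true. The query succeeds if at least one proof holds.
QUERIES = {
    "path(a,c)": [{0, 1}, {2}],
    "path(a,b)": [{0}],
}
# Hypothetical observed success probabilities used as training targets.
OBSERVED = {"path(a,c)": 0.8, "path(a,b)": 0.6}
NUM_FACTS = 3


def success_probability(proofs, probs):
    """Exact success probability by summing over all truth assignments."""
    total = 0.0
    for world in itertools.product([False, True], repeat=len(probs)):
        if any(all(world[i] for i in proof) for proof in proofs):
            weight = 1.0
            for is_true, p in zip(world, probs):
                weight *= p if is_true else 1.0 - p
            total += weight
    return total


def mse(probs):
    """Mean squared error between predicted and observed query probabilities."""
    errors = [
        (success_probability(QUERIES[q], probs) - OBSERVED[q]) ** 2
        for q in QUERIES
    ]
    return sum(errors) / len(errors)


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def fit(steps=2000, learning_rate=1.0, eps=1e-6):
    """Gradient descent on the MSE, with probabilities kept in (0, 1) by a sigmoid."""
    params = [0.0] * NUM_FACTS  # every fact starts at probability 0.5

    def loss(a):
        return mse([sigmoid(x) for x in a])

    for _ in range(steps):
        grad = []
        for j in range(NUM_FACTS):
            up = list(params)
            down = list(params)
            up[j] += eps
            down[j] -= eps
            # Central finite difference stands in for an analytic gradient.
            grad.append((loss(up) - loss(down)) / (2 * eps))
        params = [x - learning_rate * g for x, g in zip(params, grad)]
    return [sigmoid(x) for x in params]


if __name__ == "__main__":
    learned = fit()
    print("learned fact probabilities:", learned)
    print("training MSE:", mse(learned))
```

Enumerating all possible worlds is exponential in the number of facts, so this sketch only works for very small programs. The toy problem is also underdetermined (two observations, three parameters), so the fitted probabilities are one of many solutions with near-zero training error.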
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gutmann, B., Kimmig, A., Kersting, K., De Raedt, L. (2008). Parameter Learning in Probabilistic Databases: A Least Squares Approach. In: Daelemans, W., Goethals, B., Morik, K. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2008. Lecture Notes in Computer Science, vol 5211. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87479-9_49
DOI: https://doi.org/10.1007/978-3-540-87479-9_49
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87478-2
Online ISBN: 978-3-540-87479-9
eBook Packages: Computer Science (R0)