Abstract
A recurrent problem in the development of reasoning agents is how to assign degrees of belief to uncertain events in a complex environment. The standard knowledge representation framework imposes a sharp separation between learning and reasoning: the agent first acquires a "model" of its environment, represented in an expressive language, and then uses this model to quantify the likelihood of various queries. Yet, even for simple queries, evaluating probabilities from a general-purpose representation is computationally prohibitive. In contrast, this study adopts the learning to reason (L2R) framework, which aims to elicit degrees of belief inductively. The agent is viewed as an anytime reasoner that iteratively improves its performance in light of the knowledge induced from its mistakes. By coupling exponentiated gradient strategies in learning with weighted model counting techniques in reasoning, the L2R framework is shown to provide efficient solutions to relational probabilistic reasoning problems that are provably intractable in the classical paradigm.
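To make the two ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the paper's algorithm: a multiplicative exponentiated-gradient update over a weight vector on the probability simplex, and a brute-force weighted model counter over a small CNF formula. All function names and the DIMACS-style clause encoding are assumptions introduced here for illustration only.

```python
import math
from itertools import product

def eg_update(weights, grad, eta=1.0):
    """One exponentiated-gradient step: scale each weight by
    exp(-eta * gradient), then renormalize back onto the simplex."""
    scaled = [w * math.exp(-eta * g) for w, g in zip(weights, grad)]
    z = sum(scaled)
    return [s / z for s in scaled]

def wmc(clauses, weights):
    """Brute-force weighted model count.
    clauses: list of clauses, each a list of signed 1-indexed literals
             (DIMACS style: 2 means x2 is true, -2 means x2 is false).
    weights: weights[v] = (w_false, w_true) for variable v (0-indexed).
    Sums the product of literal weights over all satisfying assignments."""
    n = len(weights)
    total = 0.0
    for assignment in product([False, True], repeat=n):
        # A clause is satisfied if at least one literal agrees with the assignment.
        if all(any((lit > 0) == assignment[abs(lit) - 1] for lit in c)
               for c in clauses):
            w = 1.0
            for v, val in enumerate(assignment):
                w *= weights[v][1 if val else 0]
            total += w
    return total
```

As a usage example, counting the weight of models of the clause (x1 ∨ x2) with uniform 0.5/0.5 weights yields 0.75, since three of the four assignments satisfy it; and an EG step with a larger gradient on the first coordinate shifts mass toward the second. Exact weighted model counting is #P-hard in general; practical reasoners use compiled circuit representations rather than enumeration.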
Editors: Hendrik Blockeel, Jude Shavlik, Prasad Tadepalli.
Koriche, F. Learning to assign degrees of belief in relational domains. Mach Learn 73, 25–53 (2008). https://doi.org/10.1007/s10994-008-5075-5