A Tutorial on Query Answering and Reasoning over Probabilistic Knowledge Bases

Reasoning Web. Learning, Uncertainty, Streaming, and Scalability (Reasoning Web 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11078)

Abstract

Large-scale probabilistic knowledge bases are becoming increasingly important in academia and industry alike. They are constantly extended with new data, powered by modern information extraction tools that associate probabilities with knowledge base facts. This tutorial is dedicated to giving an understanding of the various query answering and reasoning tasks that can be used to exploit the full potential of probabilistic knowledge bases. In the first part of the tutorial, we focus on (tuple-independent) probabilistic databases as the simplest probabilistic data model. In the second part of the tutorial, we move on to richer representations where the probabilistic database is extended with ontological knowledge. For each part, we review known data complexity results and discuss some recent ones.

This tutorial is mostly based on the dissertation work [13] and previously published material [8, 14]; it also makes use of some material from the classical literature on probabilistic databases [21, 68].

References

  1. Abiteboul, S., Hull, R., Vianu, V. (eds.): Foundations of Databases: The Logical Level, 1st edn. Addison-Wesley Longman Publishing Co., Inc., Boston (1995)

  2. Amarilli, A., Bourhis, P., Senellart, P.: Tractable lineages on treelike instances: limits and extensions. In: Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS-16), pp. 355–370. ACM (2016)

  3. Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. (eds.): The Description Logic Handbook: Theory, Implementation, and Applications, 2nd edn. Cambridge University Press, Cambridge (2007)

  4. Baget, J.F., Mugnier, M.L., Rudolph, S., Thomazo, M.: Walking the complexity lines for generalized guarded existential rules. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011), pp. 712–717 (2011)

  5. Beeri, C., Vardi, M.Y.: The implication problem for data dependencies. In: Even, S., Kariv, O. (eds.) ICALP 1981. LNCS, vol. 115, pp. 73–85. Springer, Heidelberg (1981). https://doi.org/10.1007/3-540-10843-2_7

  6. Beigel, R., Reingold, N., Spielman, D.: PP is closed under intersection. J. Comput. Syst. Sci. 50(2), 191–202 (1995)

  7. Bienvenu, M., Cate, B.T., Lutz, C., Wolter, F.: Ontology-based data access: A study through disjunctive Datalog, CSP, and MMSNP. ACM Trans. Database Syst. (TODS) 39(4), 33:1–33:44 (2014)

  8. Borgwardt, S., Ceylan, İ.İ., Lukasiewicz, T.: Ontology-mediated queries for probabilistic databases. In: Proceedings of the 31st AAAI Conference on Artificial Intelligence (AAAI 2017), pp. 1063–1069. AAAI Press (2017)

  9. Borgwardt, S., Ceylan, İ.İ., Lukasiewicz, T.: Recent advances in querying probabilistic knowledge bases. In: Lang, J. (ed.) Proceedings of the 27th International Joint Conference on Artificial Intelligence and the 23rd European Conference on Artificial Intelligence, IJCAI-ECAI 2018. IJCAI/AAAI Press (2018)

  10. Calì, A., Gottlob, G., Kifer, M.: Taming the infinite chase: Query answering under expressive relational constraints. J. Artif. Intell. Res. 48, 115–174 (2013)

  11. Calì, A., Gottlob, G., Lukasiewicz, T.: A general Datalog-based framework for tractable query answering over ontologies. J. Web Semant. 14, 57–83 (2012)

  12. Calì, A., Gottlob, G., Pieris, A.: Towards more expressive ontology languages: The query answering problem. Artif. Intell. 193, 87–128 (2012)

  13. Ceylan, İ.İ.: Query answering in probabilistic data and knowledge bases. Ph.D. thesis, Technische Universität Dresden (2017)

  14. Ceylan, İ.İ., Borgwardt, S., Lukasiewicz, T.: Most probable explanations for probabilistic database queries. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI 2017), pp. 950–956. AAAI Press (2017)

  15. Ceylan, İ.İ., Darwiche, A., Van den Broeck, G.: Open-world probabilistic databases. In: Proceedings of the 15th International Conference on Principles of Knowledge Representation and Reasoning (KR 2016), pp. 339–348. AAAI Press (2016)

  16. Ceylan, İ.İ., Darwiche, A., Van den Broeck, G.: Open-world probabilistic databases: an abridged report. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI 2017), pp. 4796–4800. AAAI Press (2017)

  17. Ceylan, İ.İ., Lukasiewicz, T., Peñaloza, R.: Complexity results for probabilistic Datalog\(\pm \). In: Proceedings of the 22nd European Conference on Artificial Intelligence (ECAI 2016). Frontiers in Artificial Intelligence and Applications, vol. 285, pp. 1414–1422. IOS Press (2016)

  18. Ceylan, İ.İ., Peñaloza, R.: Probabilistic query answering in the Bayesian description logic \(\mathcal{BEL}\). In: Beierle, C., Dekhtyar, A. (eds.) SUM 2015. LNCS (LNAI), vol. 9310, pp. 21–35. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-23540-0_2

  19. Ceylan, İ.İ., Peñaloza, R.: The Bayesian ontology language \(\mathcal{BEL}\). J. Autom. Reason. 58(1), 67–95 (2017)

  20. Dalvi, N., Suciu, D.: Efficient query evaluation on probabilistic databases. VLDB J. 16(4), 523–544 (2007)

  21. Dalvi, N., Suciu, D.: The dichotomy of probabilistic inference for unions of conjunctive queries. J. ACM 59(6), 1–87 (2012)

  22. d’Amato, C., Fanizzi, N., Lukasiewicz, T.: Tractable reasoning with Bayesian description logics. In: Greco, S., Lukasiewicz, T. (eds.) SUM 2008. LNCS (LNAI), vol. 5291, pp. 146–159. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-87993-0_13

  23. Darwiche, A.: Modeling and Reasoning with Bayesian Networks. Cambridge University Press, Cambridge (2009)

  24. De Raedt, L., Kimmig, A., Toivonen, H.: ProbLog: A probabilistic prolog and its application in link discovery. In: Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI 2007), pp. 2468–2473. Morgan Kaufmann (2007)

  25. Dong, X., Gabrilovich, E., Heitz, G., Horn, W., Lao, N., Murphy, K., Strohmann, T., Sun, S., Zhang, W.: Knowledge Vault: A Web-scale approach to probabilistic knowledge fusion. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 601–610. ACM (2014)

  26. Dong, X., Gabrilovich, E., Murphy, K., Dang, V., Horn, W., Lugaresi, C., Sun, S., Zhang, W.: Knowledge-based trust: Estimating the trustworthiness of Web sources. Proc. VLDB Endowment 8(9), 938–949 (2015)

  27. Fader, A., Soderland, S., Etzioni, O.: Identifying relations for open information extraction. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1535–1545. Association for Computational Linguistics (2011)

  28. Fagin, R., Kolaitis, P.G., Miller, R.J., Popa, L.: Data exchange: Semantics and query answering. Theor. Comput. Sci. 336(1), 89–124 (2005)

  29. Ferrucci, D.A.: Introduction to “This is Watson”. IBM J. Res. Dev. 56(3), 235–249 (2012)

  30. Ferrucci, D., Levas, A., Bagchi, S., Gondek, D., Mueller, E.T.: Watson: Beyond Jeopardy! Artif. Intell. 199–200, 93–105 (2013)

  31. Fuhr, N., Rölleke, T.: A probabilistic relational algebra for the integration of information retrieval and database systems. ACM Trans. Inf. Syst. (TOIS) 15(1), 32–66 (1997)

  32. Gill, J.T.: Computational complexity of probabilistic Turing machines. SIAM J. Comput. 6(4), 675–695 (1977)

  33. Gottlob, G., Lukasiewicz, T., Martinez, M.V., Simari, G.I.: Query answering under probabilistic uncertainty in Datalog\(\pm \) ontologies. Ann. Math. Artif. Intell. 69(1), 37–72 (2013)

  34. Grädel, E., Gurevich, Y., Hirsch, C.: The complexity of query reliability. In: Proceedings of the 17th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS 1998), pp. 227–234. ACM (1998)

  35. Greenemeier, L.: Human traffickers caught on hidden internet. Sci. Am. 8 (2015)

  36. Gribkoff, E., Suciu, D.: SlimShot: In-database probabilistic inference for knowledge bases. Proc. VLDB Endowment 9(7), 552–563 (2016)

  37. Gribkoff, E., Van den Broeck, G., Suciu, D.: The most probable database problem. In: Proceedings of the 1st International Workshop on Big Uncertain Data (BUDA) (2014)

  38. Gribkoff, E., Van den Broeck, G., Suciu, D.: Understanding the complexity of lifted inference and asymmetric weighted model counting. In: Proceedings of the 30th Annual Conference on Uncertainty in Artificial Intelligence (UAI 2014), pp. 280–289. AUAI Press (2014)

  39. Hesse, W., Allender, E., Barrington, D.A.M.: Uniform constant-depth threshold circuits for division and iterated multiplication. J. Comput. Syst. Sci. 65(4), 695–716 (2002)

  40. Hoffart, J., Suchanek, F.M., Berberich, K., Weikum, G.: YAGO2: A spatially and temporally enhanced knowledge base from Wikipedia. Artif. Intell. 194, 28–61 (2013)

  41. Imieliński, T., Lipski, W.: Incomplete information in relational databases. J. ACM 31(4), 761–791 (1984)

  42. Jaeger, M.: Relational Bayesian networks. In: Proceedings of the 13th Annual Conference on Uncertainty in Artificial Intelligence (UAI 1997), pp. 266–273. Morgan Kaufmann (1997)

  43. Jung, J.C., Lutz, C.: Ontology-based access to probabilistic data with OWL QL. In: Cudré-Mauroux, P., et al. (eds.) ISWC 2012. LNCS, vol. 7649, pp. 182–197. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35176-1_12

  44. Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. MIT Press, Cambridge (2009)

  45. Krötzsch, M., Rudolph, S.: Extending decidable existential rules by joining acyclicity and guardedness. In: Proceedings of the 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011), pp. 963–968. AAAI Press (2011)

  46. Ku, J.P., Hicks, J.L., Hastie, T., Leskovec, J., Ré, C., Delp, S.L.: The mobilize center: an NIH big data to knowledge center to advance human movement research and improve mobility. J. Am. Med. Inform. Assoc. 22(6), 1120–1125 (2015)

  47. Libkin, L.: Elements of Finite Model Theory. Springer, Heidelberg (2004). https://doi.org/10.1007/978-3-662-07003-1

  48. Littman, M.L., Goldsmith, J., Mundhenk, M.: The computational complexity of probabilistic planning. J. Artif. Intell. Res. 9, 1–36 (1998)

  49. Lukasiewicz, T., Straccia, U.: Managing uncertainty and vagueness in description logics for the Semantic Web. J. Web Semant. 6(4), 291–308 (2008)

  50. Mitchell, T., et al.: Never-ending learning. In: Proceedings of the 29th AAAI Conference on Artificial Intelligence (AAAI 2015), pp. 2302–2310 (2015)

  51. Olteanu, D., Huang, J.: Using OBDDs for efficient query evaluation on probabilistic databases. In: Greco, S., Lukasiewicz, T. (eds.) SUM 2008. LNCS (LNAI), vol. 5291, pp. 326–340. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-87993-0_26

  52. Olteanu, D., Huang, J.: Secondary-storage confidence computation for conjunctive queries with inequalities. In: Proceedings of the 2009 ACM SIGMOD International Conference on Management of Data, pp. 389–402. ACM (2009)

  53. Papadimitriou, C.H.: Computational Complexity. Addison-Wesley, Boston (1994)

  54. Pearl, J.: Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, Burlington (1988)

  55. Peters, S.E., Zhang, C., Livny, M., Ré, C.: A machine reading system for assembling synthetic paleontological databases. PLoS One 9(12), e113523 (2014)

  56. Poggi, A., Lembo, D., Calvanese, D., De Giacomo, G., Lenzerini, M., Rosati, R.: Linking data to ontologies. J. Data Semant. 10, 133–173 (2008)

  57. Poole, D.: The independent choice logic for modelling multiple agents under uncertainty. Artif. Intell. 94(1–2), 7–56 (1997)

  58. Provan, J.S., Ball, M.O.: The complexity of counting cuts and of computing the probability that a graph is connected. SIAM J. Comput. 12(4), 777–788 (1983)

  59. Ré, C., Suciu, D.: The trichotomy of having queries on a probabilistic database. VLDB J. 18(5), 1091–1116 (2009)

  60. Reiter, R.: On closed world data bases. In: Gallaire, H., Minker, J. (eds.) Logic and Data Bases, pp. 55–76. Springer, Heidelberg (1978). https://doi.org/10.1007/978-1-4684-3384-5_3

  61. Richardson, M., Domingos, P.: Markov logic networks. Mach. Learn. 62(1), 107–136 (2006)

  62. Rossman, B.: Homomorphism preservation theorems. J. ACM 55(3), 1–53 (2008)

  63. Sato, T.: A statistical learning method for logic programs with distribution semantics. In: Proceedings of the 12th International Conference on Logic Programming (ICLP 1995), pp. 715–729. MIT Press (1995)

  64. Sato, T., Kameya, Y.: PRISM: A language for symbolic-statistical modeling. In: Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI 1997), pp. 1330–1335. Morgan Kaufmann (1997)

  65. Shin, J., Wu, S., Wang, F., De Sa, C., Zhang, C., Ré, C.: Incremental knowledge base construction using DeepDive. Proc. VLDB Endowment 8(11), 1310–1321 (2015)

  66. Sipser, M.: Introduction to the Theory of Computation, 1st edn. International Thomson Publishing, Boston (1996)

  67. Staworko, S., Chomicki, J.: Consistent query answers in the presence of universal constraints. Inf. Syst. 35(1), 1–22 (2010)

  68. Suciu, D., Olteanu, D., Ré, C., Koch, C.: Probabilistic Databases. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, San Rafael (2011)

  69. Toda, S.: On the computational power of PP and \(\oplus \)P. In: Proceedings of the 30th Annual Symposium on Foundations of Computer Science, pp. 514–519 (1989)

  70. Toda, S., Watanabe, O.: Polynomial-time 1-Turing reductions from #PH to #P. Theor. Comput. Sci. 100(1), 205–221 (1992)

  71. Valiant, L.G.: The complexity of computing the permanent. Theor. Comput. Sci. 8(2), 189–201 (1979)

  72. Vardi, M.Y.: The complexity of relational query languages. In: Lewis, H.R., Simons, B.B., Burkhard, W.A., Landweber, L.H. (eds.) Proceedings of the 14th Annual ACM Symposium on Theory of Computing (STOC 1982), pp. 137–146. ACM (1982)

  73. Vollmer, H.: Introduction to Circuit Complexity: A Uniform Approach. Springer, Heidelberg (1999). https://doi.org/10.1007/978-3-662-03927-4

  74. Wagner, K.W.: The complexity of combinatorial problems with succinct input representation. Acta Informatica 23(3), 325–356 (1986)

  75. Wu, W., Li, H., Wang, H., Zhu, K.Q.: Probase: A probabilistic taxonomy for text understanding. In: Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data, pp. 481–492. ACM (2012)

Acknowledgments

This work was supported by The Alan Turing Institute under the UK EPSRC grant EP/N510129/1, and by the EPSRC grants EP/R013667/1, EP/L012138/1, and EP/M025268/1.

Author information

Correspondence to İsmail İlkan Ceylan or Thomas Lukasiewicz.

Copyright information

© 2018 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Ceylan, İ.İ., Lukasiewicz, T. (2018). A Tutorial on Query Answering and Reasoning over Probabilistic Knowledge Bases. In: d’Amato, C., Theobald, M. (eds) Reasoning Web. Learning, Uncertainty, Streaming, and Scalability. Reasoning Web 2018. Lecture Notes in Computer Science, vol 11078. Springer, Cham. https://doi.org/10.1007/978-3-030-00338-8_3

  • DOI: https://doi.org/10.1007/978-3-030-00338-8_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-00337-1

  • Online ISBN: 978-3-030-00338-8

  • eBook Packages: Computer Science, Computer Science (R0)
