Pruning Hypothesis Spaces Using Learned Domain Theories

  • Conference paper
  • In: Inductive Logic Programming (ILP 2017)

Abstract

We present a method to prune hypothesis spaces in the context of inductive logic programming. The main strategy of our method consists in removing hypotheses that are equivalent to already considered hypotheses. The distinguishing feature of our method is that we use learned domain theories to check for equivalence, in contrast to existing approaches which only prune isomorphic hypotheses. Specifically, we use such learned domain theories to saturate hypotheses and then check if these saturations are isomorphic. While conceptually simple, we experimentally show that the resulting pruning strategy can be surprisingly effective in reducing both computation time and memory consumption when searching for long clauses, compared to approaches that only consider isomorphism.
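
The strategy described above can be illustrated with a toy sketch (not the authors' implementation: the literal representation, the propagation-rule format, and the brute-force renaming check are all simplifying assumptions made here for illustration). Each hypothesis is saturated with a small domain theory, and hypotheses whose saturations are isomorphic up to variable renaming are treated as equivalent:

```python
from itertools import permutations

def saturate(clause, rules):
    """Close a set of literals under simple propagation rules
    (a toy stand-in for saturation with a learned domain theory).
    Literals are (predicate, args) tuples."""
    lits = set(clause)
    changed = True
    while changed:
        changed = False
        for lit in list(lits):
            for rule in rules:
                for new in rule(lit):
                    if new not in lits:
                        lits.add(new)
                        changed = True
    return frozenset(lits)

def isomorphic(c1, c2):
    """Brute-force check whether two clauses are equal up to a
    renaming of variables (only feasible at toy scale)."""
    v1 = sorted({a for _, args in c1 for a in args})
    v2 = sorted({a for _, args in c2 for a in args})
    if len(c1) != len(c2) or len(v1) != len(v2):
        return False
    for perm in permutations(v2):
        ren = dict(zip(v1, perm))
        if {(p, tuple(ren[a] for a in args)) for p, args in c1} == set(c2):
            return True
    return False

# Hypothetical domain theory: bond/2 is symmetric (cf. note 6 below).
rules = [lambda lit: [(lit[0], (lit[1][1], lit[1][0]))] if lit[0] == "bond" else []]

h1 = {("bond", ("X", "Y"))}
h2 = {("bond", ("B", "A")), ("bond", ("A", "B"))}
s1, s2 = saturate(h1, rules), saturate(h2, rules)
print(isomorphic(frozenset(h1), frozenset(h2)))  # False: different literal counts
print(isomorphic(s1, s2))                        # True: saturations coincide up to renaming
```

The second hypothesis would survive purely isomorphism-based pruning, but is pruned once the domain theory is used to saturate both clauses first.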


Notes

  1. As we show later in the paper, the completeness requirement disqualifies relative subsumption [15] as a candidate for such a pruning method.

  2. We use invariants based on a generalized version of the Weisfeiler-Lehman procedure [22].
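
The basic one-dimensional Weisfeiler-Lehman colour refinement underlying such invariants can be sketched as follows (this is the standard textbook variant, not the generalized version the paper uses; the graph encoding is an assumption):

```python
from collections import Counter

def wl_colors(adj, labels, rounds=3):
    """One-dimensional Weisfeiler-Lehman colour refinement.
    adj: {node: [neighbours]}, labels: {node: initial label}.
    Returns a multiset (Counter) of final colours, usable as a cheap
    graph invariant: unequal counters imply non-isomorphic graphs."""
    colors = dict(labels)
    for _ in range(rounds):
        colors = {
            v: hash((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
            for v in adj
        }
    return Counter(colors.values())

# A triangle and a 3-path have the same label multiset initially,
# but WL refinement separates them after one round (degrees differ).
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
path = {0: [1], 1: [0, 2], 2: [1]}
labels = {0: "a", 1: "a", 2: "a"}
print(wl_colors(triangle, labels) == wl_colors(path, labels))  # False
```

Because equal invariants do not guarantee isomorphism, such a test can only be used to filter candidates before an exact check.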

  3. This is true for the \(\theta \)-subsumption solver based on [8] which we use in our implementation.

  4. The formulation of Herbrand's theorem used here is taken from notes by Cook and Pitassi: http://www.cs.toronto.edu/~toni/Courses/438/Mynotes/page39.pdf.

  5. Indeed, if \(\mathcal {T} \wedge \lnot C_{Sk}\) is unsatisfiable in \(\mathcal {L}\), then there is a set of corresponding \(\mathcal {L}\)-ground instances of clauses that is unsatisfiable. If we replace each constant appearing in these ground clauses that does not appear in \(C_{Sk}\) by an arbitrary constant that does appear in \(C_{Sk}\), the resulting set of ground clauses must still be inconsistent, since \(\mathcal {T}\) does not contain any constants and there is no equality in the language. Hence \(\mathcal {T} \wedge \lnot C_{Sk}\) cannot be satisfiable in \(\mathcal {L}_{Sk}\).

  6. In the physical world, bonds do not necessarily have to be symmetric; e.g., there is an obvious asymmetry in polar bonds. However, it is a common simplification in data mining on molecular datasets to assume that bonds are symmetric.

  7. Note that we are slightly abusing notation here, as \(\theta ^{-1}\) is not a substitution.

  8. Note that we only use OI-subsumption to partially order the constructed hypotheses, not to check the entailment relation.

  9. What we call a refinement operator in this paper is often called a downward refinement operator; since we consider only downward refinement operators here, we omit the word downward.

  10. If we ordered the set of clauses by \(\theta \)-subsumption instead of OI-subsumption, then a maximal clause with this property would not have to exist.

  11. For instance, Farmer [14] and RelF [9] remove isomorphic clauses (or conjunctive patterns), but many existing ILP systems do not attempt to remove isomorphic clauses.

  12. A clause is said to be connected if it cannot be written as a disjunction of two non-empty clauses. For instance, \(\forall X,Y : p_1(X) \vee p_2(Y)\) is not connected because it can also be written as \((\forall X : p_1(X)) \vee (\forall Y : p_2(Y))\), but \(\forall X,Y : p_1(X) \vee p_2(Y) \vee p_3(X,Y)\) is connected. If a clause is connected then its saturation is also connected.
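
Connectedness as defined in this note is easy to check mechanically. The following sketch (the (predicate, args) literal representation is an assumption, not the paper's) treats literals as nodes linked whenever they share a variable, and checks that this graph has a single component:

```python
def is_connected(clause):
    """A clause is connected iff its literals form one component
    of the graph linking literals that share a variable.
    Literals are (predicate, args) tuples; args are variables."""
    lits = list(clause)
    if not lits:
        return True
    seen = {0}
    frontier = [0]
    while frontier:
        i = frontier.pop()
        for j, lit in enumerate(lits):
            if j not in seen and set(lits[i][1]) & set(lit[1]):
                seen.add(j)
                frontier.append(j)
    return len(seen) == len(lits)

# The examples from the note: p1(X) ∨ p2(Y) is not connected;
# adding p3(X, Y) links the two variables and connects the clause.
print(is_connected([("p1", ("X",)), ("p2", ("Y",))]))                     # False
print(is_connected([("p1", ("X",)), ("p2", ("Y",)), ("p3", ("X", "Y"))])) # True
```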

  13. Frequent conjunctive pattern mining can be emulated in our setting: it suffices to notice that the clauses we construct are just negations of conjunctive patterns.

  14. Available from https://github.com/martinsvat.

References

  1. Le Berre, D., Parrain, A.: The SAT4J library, release 2.2. J. Satisfiability Boolean Model. Comput. 7, 50–64 (2010)

  2. Buntine, W.L.: Generalized subsumption and its applications to induction and redundancy. Artif. Intell. 36(2), 149–176 (1988)

  3. Chekuri, C., Rajaraman, A.: Conjunctive query containment revisited. Theor. Comput. Sci. 239(2), 211–229 (2000)

  4. Dechter, R.: Constraint Processing. Elsevier Morgan Kaufmann, San Francisco (2003)

  5. Dehaspe, L., De Raedt, L.: Mining association rules in multiple relations. In: Lavrač, N., Džeroski, S. (eds.) ILP 1997. LNCS, vol. 1297, pp. 125–132. Springer, Heidelberg (1997). https://doi.org/10.1007/3540635149_40

  6. Ferilli, S., Fanizzi, N., Di Mauro, N., Basile, T.M.: Efficient \(\theta \)-subsumption under object identity. In: 2002 AI*IA Workshop, pp. 59–68 (2002)

  7. van Hoeve, W.J.: The alldifferent constraint: A survey (2001). CoRR cs.PL/0105015. http://arxiv.org/abs/cs.PL/0105015

  8. Kuželka, O., Železný, F.: A restarted strategy for efficient subsumption testing. Fundam. Inform. 89(1), 95–109 (2008)

  9. Kuželka, O., Železný, F.: Block-wise construction of tree-like relational features with monotone reducibility and redundancy. Mach. Learn. 83(2), 163–192 (2011)

  10. Malerba, D.: Learning recursive theories in the normal ILP setting. Fundam. Inform. 57(1), 39–77 (2003)

  11. Maloberti, J., Sebag, M.: Fast theta-subsumption with constraint satisfaction algorithms. Mach. Learn. 55(2), 137–174 (2004)

  12. Muggleton, S.: Inverse entailment and Progol. New Gen. Comput. 13(3–4), 245–286 (1995)

  13. Newborn, M.: Automated Theorem Proving - Theory and Practice. Springer, New York (2001). https://doi.org/10.1007/978-1-4613-0089-2

  14. Nijssen, S., Kok, J.N.: Efficient frequent query discovery in Farmer. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) PKDD 2003. LNCS (LNAI), vol. 2838, pp. 350–362. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-39804-2_32

  15. Plotkin, G.D.: A note on inductive generalization. Mach. Intell. 5(1), 153–163 (1970)

  16. De Raedt, L.: Logical settings for concept-learning. Artif. Intell. 95(1), 187–201 (1997)

  17. Ralaivola, L., Swamidass, S.J., Saigo, H., Baldi, P.: Graph kernels for chemical informatics. Neural Netw. 18(8), 1093–1110 (2005)

  18. Ramon, J., Roy, S., Jonny, D.: Efficient homomorphism-free enumeration of conjunctive queries. In: Preliminary Papers ILP 2011, p. 6 (2011)

  19. Riedel, S.: Improving the accuracy and efficiency of MAP inference for Markov logic. In: 24th Conference on Uncertainty in Artificial Intelligence, UAI 2008, pp. 468–475 (2008)

  20. Stepp, R.E., Michalski, R.S.: Conceptual clustering: inventing goal-oriented classifications of structured objects. In: Machine Learning: An Artificial Intelligence Approach, vol. 2, pp. 471–498 (1986)

  21. Tamaddoni-Nezhad, A., Muggleton, S.: The lattice structure and refinement operators for the hypothesis space bounded by a bottom clause. Mach. Learn. 76(1), 37–72 (2009)

  22. Weisfeiler, B., Lehman, A.: A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsia 2(9), 12–16 (1968)

Acknowledgements

MS, GŠ and FŽ acknowledge support by project no. 17-26999S granted by the Czech Science Foundation. This work was done while OK was with Cardiff University and supported by a grant from the Leverhulme Trust (RPG-2014-164). SS is supported by ERC Starting Grant 637277. Computational resources were provided by the CESNET LM2015042 and the CERIT Scientific Cloud LM2015085, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures”.

Author information

Correspondence to Martin Svatoš.


Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Svatoš, M., Šourek, G., Železný, F., Schockaert, S., Kuželka, O. (2018). Pruning Hypothesis Spaces Using Learned Domain Theories. In: Lachiche, N., Vrain, C. (eds) Inductive Logic Programming. ILP 2017. Lecture Notes in Computer Science(), vol 10759. Springer, Cham. https://doi.org/10.1007/978-3-319-78090-0_11

  • DOI: https://doi.org/10.1007/978-3-319-78090-0_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-78089-4

  • Online ISBN: 978-3-319-78090-0

  • eBook Packages: Computer Science, Computer Science (R0)
