Abstract
We present a method to prune hypothesis spaces in the context of inductive logic programming. The main strategy of our method consists in removing hypotheses that are equivalent to already considered hypotheses. The distinguishing feature of our method is that we use learned domain theories to check for equivalence, in contrast to existing approaches which only prune isomorphic hypotheses. Specifically, we use such learned domain theories to saturate hypotheses and then check if these saturations are isomorphic. While conceptually simple, we experimentally show that the resulting pruning strategy can be surprisingly effective in reducing both computation time and memory consumption when searching for long clauses, compared to approaches that only consider isomorphism.
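The saturate-then-compare idea can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the authors' implementation: hypotheses are represented as conjunctive patterns (sets of atoms over variables) rather than clauses, the domain theory is restricted to range-restricted Horn rules, and isomorphism is tested by brute force over variable renamings. The names `saturate`, `isomorphic` and `prune` are hypothetical.

```python
from itertools import permutations

# Assumed representation: an atom is (predicate, args); a pattern is a set of
# atoms whose arguments are variables. A rule is (body_atoms, head_atom) and
# every head variable is assumed to occur in the body.

def _matches(body, facts, binding=None):
    """Yield every substitution mapping all body atoms onto atoms in facts."""
    binding = binding or {}
    if not body:
        yield binding
        return
    (pred, args), rest = body[0], body[1:]
    for fpred, fargs in facts:
        if fpred != pred or len(fargs) != len(args):
            continue
        new = dict(binding)
        if all(new.setdefault(a, f) == f for a, f in zip(args, fargs)):
            yield from _matches(rest, facts, new)

def saturate(pattern, rules):
    """Add every atom derivable from the pattern by forward chaining."""
    facts = set(pattern)
    changed = True
    while changed:
        changed = False
        for body, (hpred, hargs) in rules:
            # Materialise matches first so that facts is not mutated mid-iteration.
            for theta in list(_matches(list(body), facts)):
                derived = (hpred, tuple(theta[a] for a in hargs))
                if derived not in facts:
                    facts.add(derived)
                    changed = True
    return frozenset(facts)

def isomorphic(p, q):
    """Brute-force test: is p equal to q up to a bijective variable renaming?"""
    vars_p = sorted({a for _, args in p for a in args})
    vars_q = sorted({a for _, args in q for a in args})
    if len(p) != len(q) or len(vars_p) != len(vars_q):
        return False
    for perm in permutations(vars_q):
        ren = dict(zip(vars_p, perm))
        if {(pred, tuple(ren[a] for a in args)) for pred, args in p} == set(q):
            return True
    return False

def prune(candidates, rules):
    """Keep one candidate per class of isomorphic saturations."""
    kept, seen = [], []
    for c in candidates:
        s = saturate(c, rules)
        if not any(isomorphic(s, t) for t in seen):
            kept.append(c)
            seen.append(s)
    return kept

# Toy domain theory: bonds are symmetric.
rules = [([("bond", ("X", "Y"))], ("bond", ("Y", "X")))]
h1 = frozenset({("carbon", ("A",)), ("bond", ("A", "B"))})
h2 = frozenset({("carbon", ("A",)), ("bond", ("B", "A"))})
remaining = prune([h1, h2], rules)
```

In this toy case `h1` and `h2` are not isomorphic as written, but their saturations under the symmetry rule coincide, so only `h1` is kept. A practical system would replace the factorial-time renaming search with invariants and a dedicated isomorphism or subsumption solver.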
Notes
- 1.
As we show later in the paper, the completeness requirement disqualifies relative subsumption [15] as a candidate for such a pruning method.
- 2.
We use invariants based on a generalized version of the Weisfeiler-Lehman procedure [22].
- 3.
This is true for the \(\theta \)-subsumption solver based on [8], which we use in our implementation.
- 4.
The formulation of Herbrand’s theorem used here is taken from notes by Cook and Pitassi: http://www.cs.toronto.edu/~toni/Courses/438/Mynotes/page39.pdf.
- 5.
Indeed, if \(\mathcal {T} \wedge \lnot C_{Sk}\) is unsatisfiable in \(\mathcal {L}\), then there is a corresponding set of \(\mathcal {L}\)-ground instances of clauses that is unsatisfiable. If we replace each constant appearing in these ground clauses which does not appear in \(C_{Sk}\) by an arbitrary constant that does appear in \(C_{Sk}\), then the resulting set of ground clauses must still be inconsistent, since \(\mathcal {T}\) does not contain any constants and there is no equality in the language. Hence \(\mathcal {T} \wedge \lnot C_{Sk}\) cannot be satisfiable in \(\mathcal {L}_{Sk}\).
- 6.
In the physical world, bonds do not necessarily have to be symmetric, e.g. there is an obvious asymmetry in polar bonds. However, it is a common simplification in data mining on molecular datasets to assume that bonds are symmetric.
- 7.
Note that we are slightly abusing notation here, as \(\theta ^{-1}\) is not a substitution.
- 8.
Note that we only use OI-subsumption to partially order the constructed hypotheses, not to check the entailment relation.
- 9.
What we call a refinement operator in this paper is often called a downward refinement operator. Since we only consider downward refinement operators in this paper, we omit the word downward.
- 10.
If we ordered the set of clauses by \(\theta \)-subsumption instead of OI-subsumption, a maximal clause with this property would not necessarily exist.
- 11.
- 12.
A clause is said to be connected if it cannot be written as a disjunction of two non-empty clauses. For instance, \(\forall X,Y : p_1(X) \vee p_2(Y)\) is not connected because it can also be written as \((\forall X : p_1(X)) \vee (\forall Y : p_2(Y))\), whereas \(\forall X,Y : p_1(X) \vee p_2(Y) \vee p_3(X,Y)\) is connected. If a clause is connected then its saturation is also connected.
- 13.
Frequent conjunctive pattern mining can be emulated in our setting. It is enough to notice that the clauses that we construct are just negations of conjunctive patterns.
- 14.
Available from https://github.com/martinsvat.
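The connectedness condition of note 12 can be checked mechanically. The following is a minimal sketch, assuming a clause is given as a list of literals `(predicate, args)` whose arguments are all variables, and that two literals are linked whenever they share a variable; the function name `is_connected` is hypothetical.

```python
def is_connected(clause):
    """Return True if the literals form one component under variable sharing."""
    literals = list(clause)
    if not literals:
        return True  # assumption: the empty clause is treated as connected
    reached = {0}
    frontier = [0]
    while frontier:
        i = frontier.pop()
        vars_i = set(literals[i][1])
        for j, (_, args) in enumerate(literals):
            # Literal j joins the component if it shares a variable with literal i.
            if j not in reached and vars_i & set(args):
                reached.add(j)
                frontier.append(j)
    return len(reached) == len(literals)

# The two clauses from note 12.
split = [("p1", ("X",)), ("p2", ("Y",))]
joined = [("p1", ("X",)), ("p2", ("Y",)), ("p3", ("X", "Y"))]
```

Here `is_connected(split)` is false because the two literals share no variable, while `is_connected(joined)` is true because `p3(X,Y)` links them.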
References
Le Berre, D., Parrain, A.: The SAT4J library, release 2.2. J. Satisfiability Boolean Model. Comput. 7, 50–64 (2010)
Buntine, W.L.: Generalized subsumption and its applications to induction and redundancy. Artif. Intell. 36(2), 149–176 (1988)
Chekuri, C., Rajaraman, A.: Conjunctive query containment revisited. Theor. Comput. Sci. 239(2), 211–229 (2000)
Dechter, R.: Constraint Processing. Elsevier Morgan Kaufmann, San Francisco (2003)
Dehaspe, L., De Raedt, L.: Mining association rules in multiple relations. In: Lavrač, N., Džeroski, S. (eds.) ILP 1997. LNCS, vol. 1297, pp. 125–132. Springer, Heidelberg (1997). https://doi.org/10.1007/3540635149_40
Ferilli, S., Fanizzi, N., Di Mauro, N., Basile, T.M.: Efficient \(\theta \)-subsumption under object identity. In: 2002 AI*IA Workshop, pp. 59–68 (2002)
van Hoeve, W.J.: The alldifferent constraint: A survey (2001). CoRR cs.PL/0105015. http://arxiv.org/abs/cs.PL/0105015
Kuželka, O., Železný, F.: A restarted strategy for efficient subsumption testing. Fundam. Inform. 89(1), 95–109 (2008)
Kuželka, O., Železný, F.: Block-wise construction of tree-like relational features with monotone reducibility and redundancy. Mach. Learn. 83(2), 163–192 (2011)
Malerba, D.: Learning recursive theories in the normal ILP setting. Fundam. Inform. 57(1), 39–77 (2003)
Maloberti, J., Sebag, M.: Fast theta-subsumption with constraint satisfaction algorithms. Mach. Learn. 55(2), 137–174 (2004)
Muggleton, S.: Inverse entailment and Progol. New Gen. Comput. 13(3–4), 245–286 (1995)
Newborn, M.: Automated Theorem Proving - Theory and Practice. Springer, New York (2001). https://doi.org/10.1007/978-1-4613-0089-2
Nijssen, S., Kok, J.N.: Efficient frequent query discovery in Farmer. In: Lavrač, N., Gamberger, D., Todorovski, L., Blockeel, H. (eds.) PKDD 2003. LNCS (LNAI), vol. 2838, pp. 350–362. Springer, Heidelberg (2003). https://doi.org/10.1007/978-3-540-39804-2_32
Plotkin, G.D.: A note on inductive generalization. Mach. Intell. 5(1), 153–163 (1970)
De Raedt, L.: Logical settings for concept-learning. Artif. Intell. 95(1), 187–201 (1997)
Ralaivola, L., Swamidass, S.J., Saigo, H., Baldi, P.: Graph kernels for chemical informatics. Neural Netw. 18(8), 1093–1110 (2005)
Ramon, J., Roy, S., Daenen, J.: Efficient homomorphism-free enumeration of conjunctive queries. In: Preliminary Papers ILP 2011, p. 6 (2011)
Riedel, S.: Improving the accuracy and efficiency of MAP inference for Markov logic. In: 24th Conference on Uncertainty in Artificial Intelligence, UAI 2008, pp. 468–475 (2008)
Stepp, R.E., Michalski, R.S.: Conceptual clustering: inventing goal-oriented classifications of structured objects. In: Machine Learning: An Artificial Intelligence Approach, vol. 2, pp. 471–498 (1986)
Tamaddoni-Nezhad, A., Muggleton, S.: The lattice structure and refinement operators for the hypothesis space bounded by a bottom clause. Mach. Learn. 76(1), 37–72 (2009)
Weisfeiler, B., Lehman, A.: A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsia 2(9), 12–16 (1968)
Acknowledgements
MS, GŠ and FŽ acknowledge support by project no. 17-26999S granted by the Czech Science Foundation. This work was done while OK was with Cardiff University and supported by a grant from the Leverhulme Trust (RPG-2014-164). SS is supported by ERC Starting Grant 637277. Computational resources were provided by the CESNET LM2015042 and the CERIT Scientific Cloud LM2015085, provided under the programme “Projects of Large Research, Development, and Innovations Infrastructures”.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Svatoš, M., Šourek, G., Železný, F., Schockaert, S., Kuželka, O. (2018). Pruning Hypothesis Spaces Using Learned Domain Theories. In: Lachiche, N., Vrain, C. (eds) Inductive Logic Programming. ILP 2017. Lecture Notes in Computer Science, vol 10759. Springer, Cham. https://doi.org/10.1007/978-3-319-78090-0_11
DOI: https://doi.org/10.1007/978-3-319-78090-0_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78089-4
Online ISBN: 978-3-319-78090-0
eBook Packages: Computer Science (R0)