Abstract
Feature selection methods often improve the performance of attribute-value learning. We explore whether also in relational learning, examples in the form of clauses can be reduced in size to speed up learning without affecting the learned hypothesis. To this end, we introduce the notion of safe reduction: a safely reduced example cannot be distinguished from the original example under the given hypothesis language bias. Next, we consider the particular, rather permissive bias of bounded treewidth clauses. We show that under this hypothesis bias, examples of arbitrary treewidth can be reduced efficiently. We evaluate our approach on four data sets with the popular system Aleph and the state-of-the-art relational learner nFOIL. On all four data sets we make learning faster in the case of nFOIL, achieving an order-of-magnitude speed up on one of the data sets, and more accurate in the case of Aleph.
Similar content being viewed by others
Notes
This is not always the case. For example, we have a decision procedure for x-subsumption w.r.t. the set of all clauses โ the ordinary ๐-subsumption which is decidable but NP-hard.
Note again the terminology used in this paper following Atserias et al. (2007). In CSP-literature, it is often common to call 2-consistency what we call 1-consistency.
We are applying ๐ also to e because e need not be ground.
A first-order interpretation consists of several components, one them is the domain of discourse, and another is a function ฯ which maps constants to elements of the domain of discourse. We assume w.l.o.g that there is a constant c d for every element d of the domain of discourse.
References
Appice, A., Ceci,M., Rawles, S., Flach, P.A. (2004). Redundant feature elimination for multi-class problems. In ICML (vol. 69).
Atserias, A., Bulatov, A., Dalmau, V. (2007). On the power of k-consistency. In Proceedings of ICALP-2007 (pp. 266โ271).
Beeri, C., Fagin, R., Maier, D., Yannakakis, M. (1983). On the desirability of acyclic database schemes. Journal of ACM, 30(3), 479โ513.
Bodlaender, H.L., & Mohring, R.H. (1993). The pathwidth and treewidth of cographs. SIAM Journal of Discrete Methematics, 6, 238โ255.
Courcelle, B. (1990). The monadic second-order logic of graphs. i. recognizable sets of finite graphs. Information and Computation, 85(1), 12โ75.
De Raedt, L. (1997).) Logical settings for concept-learning. Artificial Intelligence, 95(1), 187โ201.
De Raedt, L. (2008). Logical and relational learning. New York: Springer.
Dechter, R. (2003). Constraint processing. San Francisco: Morgan Kaufmann.
Erickson, J. (2009). CS 598: Computational topology, course notes, University of Illinois at Urbana-Champaign. http://compgeom.cs.uiuc.edu/~jeffe/teaching/comptop/.
Fagin, R. (1983). Degrees of acyclicity for hypergraphs and relational database schemes. Journal of the ACM, 30(3), 514โ550.
Feder, T., & Vardi, M.Y. (1998). The computational structure of monotone monadic snp and constraint satisfaction: a study through datalog and group theory. SIAM Journal on Computing, 28(1), 57โ104.
Freuder, E.C. (1990). Complexity of k-tree structured constraint satisfaction problems. In Proceedings of the eighth national conference on artificial intelligence (vol. 1, pp. 4โ9). AAAIโ90: AAAI Press.
Hastie, T., Tibshirani, R., Friedman, J. (2001). The elements of statistical learning: data mining, inference, and prediction. New York: Springer.
Helma, C., King, R.D., Kramer, S., Srinivasan, A. (2001). The predictive toxicology challenge 2000โ2001. Bioinformatics, 17(1), 107โ108.
Krogel, M.A., Rawles, S., ลฝeleznรฝ, F., Flach, P., Lavrac, N., Wrobel, S. (2003). Comparative evaluation of approaches to propositionalization. In ILP. Springer.
Kuลพelka, O., & ลฝeleznรฝ, F. (2009). Block-wise construction of acyclic relational features with monotone irreducibility and relevancy properties. In ICML 2009: the 26th International Conference on Machine Learning.
Kuลพelka, O., ลฝeleznรฝ, F. (2011a). Block-wise construction of tree-like relational features with monotone reducibility and redundancy. Machine Learning, 83, 163โ192.
Kuลพelka, O., ลฝeleznรฝ, F. (2011b). Seeing the world through homomorphism: An experimental study on reducibility of examples. In ILPโ10: Inductive logic programming (pp. 138โ145).
Kuลพelka, O., Szabรณovรก, A., ลฝeleznรฝ, F. (2013a). Bounded least general generalization. In ILPโ12: inductive logic programming.
Kuลพelka, O., Szabรณovรก, A., ลฝeleznรฝ, F. (2013b). Reducing examples in relational learning with bounded-treewidth hypotheses. In New frontiers in mining complex patterns (pp. 17โ32).
Landwehr, N., Kersting, K., Raedt, L.D. (2007). Integrating naรฏve bayes and FOIL. Journal of Machine Learning Research, 8, 481โ507.
Lavraฤ, N., Gamberger, D., Jovanoski, V. (1999). A study of relevance for learning in deductive databases. Journal of Logic Programming, 40(2/3), 215โ249.
Liu, H.,Motoda, H., Setiono, R., Zhao, Z. (2010). Feature selection: an ever evolving frontier in data mining. Journal of Machine Learning Research - Proceedings Track, 10, 4โ13.
Mackworth, A. (1977). Consistency in networks of relations. Artificial Intelligence, 8(1), 99โ118.
Maloberti, J., & Sebag, M. (2004). Fast theta-subsumption with constraint satisfaction algorithms. Machine Learning, 55(2), 137โ174.
Muggleton, S. (1995). Inverse entailment and Progol. New Generation Computing, Special Issue on Inductive Logic Programming, 13(3โ4), 245โ286.
Nassif, H., Al-Ali, H., Khuri, S., Keirouz, W., Page, D. (2009). An inductive logic programming approach to validate hexose biochemical knowledge. In: Proceedings of the 19th international conference on ILP (pp. 149โ165). Leuven.
Nienhuys-Cheng, S.H., de Wolf, R., (eds.) (1997). Foundations of inductive logic programming. Lecture Notes in Computer Science (vol. 1228). Springer.
Plotkin, G. (1970). A note on inductive generalization. Edinburgh: Edinburgh University Press.
Rossi, F., van Beek, P., Walsh T., (Eds.) (2006). Handbook of constraint programming. New York: Elsevier.
ลฝakovรก, M., ลฝeleznรฝ, F., Garcia-Sedano, J., Tissot, C.M., Lavraฤ, N., Kลemen, P., Molina, J. (2007). Relational data mining applied to virtual engineering of product designs. In ILP06, LNAI (vol. 4455, pp. 439โ453). Springer.
Acknowledgments
This work was supported by the Czech Grant Agency through project 103/11/2170 Transferring ILP techniques to SRL. The authors would like to thank the anonymous reviewers of NFMCPโ12 and of the JIIS special issue for helpful remarks.
Author information
Authors and Affiliations
Corresponding author
Appendix A: Propositions and Proofs
Appendix A: Propositions and Proofs
Proposition 9
Let X be a set of clauses. If โผ X is x-subsumption w.r.t. X and โ X is an x-presubsumption w.r.t. X then (A โ X B) โ (A โผ X B) for any two clauses A, B (not necessarily from X).
Proof
We need to show that if A โ x B then (C โผ ๐ A) โ (C โผ ๐ B) for all clauses C โ X. First, if A โ x B and C โ ๐ A then the proposition holds trivially. Second, C โผ ๐ A means that there is a substitution ๐ such that C๐ โ A. This implies C๐ โ X B using the condition 1 from definition of x-presubsumption. Now, we can use the second condition which gives us C โผ ๐ B (note that C โ X and C๐ โ X B). โก
Proposition 10
Let k โ N and let โ k be a relation on clauses defined as follows: A โ k B if and only if the k-consistency algorithm run on the CSP-encoding of the ๐-subsumption problem A โผ ๐ B returns true. The relation โ k is an x-presubsumption w.r.t. the set X k of all clauses with treewidth at most k.
Proof
We need to verify that โ k satisfies the conditions stated in Definition 9.
-
1.
If A โ k B and C โ A then C โ k B. This holds because if the k-consistency algorithm returns true for a problem then it must also return true for any of its subproblems (recall the discussion in Section 2.4). It is easy to check that if C โ A are clauses then the CSP problem encoding the ๐-subsumption problem C โผ ๐ B is a subproblem of the CSP encoding of the ๐-subsumption problem A โผ ๐ B. Therefore this condition holds.
-
2.
If A โ X, ๐ is a substitution and A๐ โ x B then A โผ ๐ B. If A๐ โ k B then the k-consistency algorithm applied on the CSP encoding ๐ซ๐ = (๐ฑ๐, ๐๐, ๐๐) of the problem A๐ โผ ๐ B finishes with a non-empty set of partial solutions H ๐ . When applied on the CSP encoding P = (๐ฑ, ๐, ๐)s of the problem A โผ ๐ B, the k-consistency algorithm finishes with a set of partial solutions H. What we need to show is that this set H is non-empty. We will do this by showing that H has a non-empty subset H โ which can be constructed as follows. For each partial solution ฯ โ H ๐ , we take all sets of variables V โ ๐ฑ such that |V| โค k + 1 and V๐ โ Supp(ฯ) and, for each such set v, we add to H โ a new partial solution ฯ โ such that Supp(ฯ โ) = V and ฯ โ(v) = ฯ(v๐) for all v โ Supp(ฯ โ). Clearly, any such ฯ โis a valid partial solution of ๐ซ because otherwise ฯ could not have been a valid partial solution of ๐ซ๐. Moreover, for every ฯ โ โ H โwith |Supp(ฯ โ) | โค k and every variable v โ ๐ฑ, there exists a partial solution ฯ โ โ H โ such that ฯ โ โ ฯ โ and v โ Supp(ฯ โ). This can be shown by contradiction as follows. Let us suppose that there is a partial solution ฯ โ โ H โ such that |Supp(ฯ โ) | โค k and a variable v โ ๐ฑ such that there is no ฯ โ โ H โ which would satisfy ฯ โ โ ฯ โand v โ Supp(ฯ โ). But then the respective solution ฯ โ H ๐ from which ฯ โ was constructed should have been removed by the k-consistency algorithm because there could not have been any partial solution ฯ โ H ๐ such that ฯ โ ฯ and v๐ โ Supp(ฯ) which is a contradiction. For every partial solution ฯ โ โ H โ, the set H โ must also contain all partial โsub-solutionsโ (i.e. solutions ฯโฒโ such that ฯโฒโ โ ฯ โ). As a consequence of the above, the k-consistency algorithm running on the CSP encoding of the problem A โผ ๐ B cannot remove from H any partial solution contained in the set H โ and, thus, H โ โ H and the k-consistency algorithm must return the value true. Since A โ X k , it must also hold A โผ ๐ B.
-
3.
If A โผ ๐ B then A โ k B. This is a property of k-consistency.
โก
Proposition 11
Let us have a set X and a polynomial-time decision procedure for checking โ X which is an x-presubsumption w.r.t. the set X. Then, given a clause A on input, the literal-elimination algorithm finishes in polynomial time and outputs a clause ร satisfying the following conditions:
-
1.
ร โผ ๐ A and A โผ X ร where โผ X is an x-subsumption w.r.t. the set X.
-
2.
|ร| โค |ร ๐ | where ร ๐ is a ๐-reduction of a subset of Aโs literals with maximum length.
Proof
We start by proving ร โผ ๐ A and A โผ X ร. This can be shown as follows. First, A โผ X Aโฒ holds in any step of the algorithm which follows from (Aโฒ โ X Aโฒโ{L}) โ (Aโฒ โผ X Aโฒโ{L}) โ recall that Aโฒ is replaced by Aโฒโ{L} in the literal elimination algorithm if and only if Aโฒ โ X Aโฒโ{L} โ and from transitivity of x-subsumption. Consequently we also have A โผ X ร because ร = Aโฒ in the last step of the algorithm. Second, ร โผ ๐ A because ร โ A. Now, we prove the second part of the proposition. What remains to be shown is that the resulting clause ร will not be bigger than ร ๐ . Since ร โ A, it suffices to show that ร cannot be ๐-reducible. Let us assume, for contradiction, that it is ๐-reducible. If ร was ๐-reducible, there would have to be a literal L โ ร such that ร โผ๐ รโ{L}. The relation โ X satisfies (A โผ ๐ B) โ (A โ X B) therefore it would also have to hold Aโฒ โ X Aโฒโ{L}. However, then L should have been removed by the literal-elimination algorithm which is a contradiction with ร being output of it. The fact that the literal-elimination algorithm finishes in polynomial time follows from the fact that, for a given clause A, it calls the polynomial-time procedure for checking the relation โ X at most |A|2 times (the other operations of the literal-elimination algorithm can be performed in polynomial time as well). โก
We will need the following simple lemma in the proof of Proposition 6.
Lemma 1
Let A be a clause. If A is ๐-reducible then A๐ โ ๐ A where ๐ = {V/t} for some V โ vars(A) and t โ terms(A) , V โ t and |vars(A๐)| < |vars(A)| .
Proof
Let A be ๐-reducible and let ร ๐ โ A be ๐-reduction of A. There must be a substitution ๐ โ such that A๐ โ = ร ๐ and |A๐ โ | < |A| and therefore also |vars(A๐ โ) | < |vars(A) | (because if |vars(A๐ โ) | = |vars(A) | then A and A๐ โ would be isomorphic and they would have to contain the same number of literals). Let V โ be a variable contained in vars(A) but not contained in vars(A๐ โ) (there must be some such variable because |vars(A๐ โ) | < |vars(A) | ) and let ๐ = {V โ/t} โ ๐ โ. There are two cases depending on whether T is a variable or a constant. Let us start with the case when T is a variable which we denote as W = t for clarity. Now, there must be a variable S โ V โ such that {S/W} โ ๐ which can be shown as follows. If there is no such variable (i.e. if the only variable which is mapped on W is V โ) then we can construct a new substitution ๐ โ โ = ๐ โ ๐ โ which has the following two โinterestingโ properties (โinterestingโ because they will allow us to arrive at a contradiction): (i) A๐ โ โ โ ๐ A and (ii) |vars(A๐ โ โ) | < |vars(A๐ โ) | . The first property holds because A โผ ๐ A โ โ ร ๐ โ A implies A โผ ๐ A๐ โ๐โ โ ร ๐ . The second property can be shown as follows. Clearly, none of the variables not in vars(A๐ โ) can reappear in vars(A๐ โ โ). Moreover, the variable W cannot be contained in vars(A๐ โ โ) because V โ ๐ โ โ โ W and there was no other variable mapped to V โor W by ๐ โ. Therefore |vars(A๐ โ โ) | < |vars(A๐ โ) | , i.e. the second property must be true as well. However, then we have a contradiction because we assumed that ร ๐ was ๐-reduction but ร ๐ cannot be a ๐-reduction because A โ ๐ A๐ โ โand |A๐ โ โ | < |A๐ โ| (which follows from |vars (A๐ โโ)| < |vars(A๐ โ)| = |vars(ร ๐ )| and A๐ โโ โ ร ๐ ). Thus, there must be a variable S โ V โ such that {S/W} โ ๐. It follows that if we set ๐ = {V โ/S} it must hold A๐ โ = A๐๐ โ. Therefore A๐ โผ ๐ A๐ โ โ ๐ A. Now, if T is not a variable but a constant, we can simply set ๐ = {V โ/t} and it must hold A๐ โ = A๐๐ โ and therefore also A๐ โผ ๐ A โ โ ๐ A. Since also trivially A โผ ๐ A๐, we have A๐ โ ๐ A and |vars(A๐) | < |vars(A) | which finishes the proof of this Lemma. โก
Proposition 12
Let us have a set X and a polynomial-time decision procedure for checking โ X which is an x-presubsumption w.r.t. the set X. Then, given a clause A on input, the literal substitution algorithm finishes in polynomial time and outputs a clause ร satisfying the following conditions:
-
1.
ร โผ X A and A โผ ๐ ร where โผ X is the x-subsumption w.r.t. the set x.
-
2.
|terms(ร)| โค |terms(ร ๐ )| where ร ๐ is a ๐-reduction of a clause A๐ with maximum cardinality of the set (ร ๐ ) where ๐ is some substitution mapping variables to elements of the set terms(A).
Proof
We start by proving ร โผ X A and A โผ ๐ ร. This can be shown as follows. First, Aโฒ โผ X A holds in any step of the algorithm which follows from (Aโฒ๐ โ X Aโฒ) โ (Aโฒ๐ โผ X Aโฒ) โ recall that Aโฒis replaced by Aโฒ๐ in the literal substitution algorithm if and only if A>โฒ๐ โ X Aโฒ โ and from transitivity of x-subsumption. Consequently we also have ร โผ X A because ร = Aโฒ in the last step of the algorithm. Second, A โผ ๐ ร because ร = A๐โฒ for some substitution ๐โฒ( ๐โฒis the substitution composed of the substitutions ๐ applied on Aโฒin the literal substitution algorithm). Now, we prove the second part of the proposition. What remains to be shown is that the resulting clause ร will not have more terms than ร ๐ . Since ร = A๐โฒ, it suffices to show that ร cannot be ๐-reducible. Let us assume, for contradiction, that it is ๐-reducible. If ร was ๐-reducible, then there would have to be a substitution ๐ = {V/t} such that A๐ โ ๐ A where ๐ = {V/t} for some v โ vars(A), t โ terms(A), V โ t (this follows from Lemma 1). The relation โ X satisfies (A โผ ๐ B) โ (A โ X B) therefore it would also have to hold Aโฒ๐ โ X Aโฒ. However, then ๐ should have been applied on Aโฒ by the literal substitution algorithm which is a contradiction with ร being output of it. The fact that the literal-substitution algorithm finishes in polynomial time follows from the fact that, for a given clause A, it calls the polynomial-time procedure for checking the relation โ X at most |A| 3times (the other operations of the literal-substitution algorithm can be performed in polynomial time as well). โก
Lemma 2
(Plotkin 1970) Let A and B be clauses. If A โผ ๐ B then A โง B.
Proposition 13
Let โ be a hypothesis language and let e be a clause. Let eอ be a clause obtained from e by variabilizing the constants which are not contained in the hypothesis language. Then (โ โง e) โ (โ โง eอ) for any โ โ โ. If รช is a ๐-reduction of eอ and |รช| < |e| then รช is also a safe reduction of e.
Proof
We will start by showing validity of the implication (H โง e) โ (H โง eอ ). For contradiction, let us assume that H โง e and that there is a model M of the clausal theory H such that M โง e and M โฎ eอ . Then there must be a substitution ๐ grounding all variables in eอ such thatFootnote 4 M โง e๐ and M โฎ eอ ๐. Now, we will construct another model Mโฒof H in which e will not be satisfied. We take each constant C in e that has been replaced by a variable v in eอ and update the assignment ฯ of the constants C to objects from the domain of discourse in the modelFootnote 5 so that ฯ(c) = ฯ(V๐). Clearly, we can do this for every constant C since every constant in e has been replaced by exactly one variable. Now, we see that Mโฒ โฎ e. However, we are not done yet as it might happen that the new model with the modified ฯ would no longer be a model of H. However, this is clearly not the case since none of the constants C appears in H and therefore the change of ฯ has no effect whatsoever on whether or not H is true in Mโฒ. So, we have arrived at a contradiction. We have a model Mโฒsuch that Mโฒ โง H and Mโฒ โฎ e which contradicts the assumption H โง e. The implication (H โง e) โ (H โง eอ ) follows directly from Lemma 2. We have eอ โผ ๐ e therefore also eอ โง e and finally eอ โผ ๐ e. (ii) In order to show (H โง e) โ (H โง รช), it suffices to notice that H โง e and e โผ ๐ รช imply H โง รช. The implication (H โง e) โ (H โง รช) may be shown similarly as follows: H โง รช and รช โผ ๐ e imply H โง e. โก
The next lemmas are used to prove Proposition 8. In Lemmas 3 and 4, we formulate rather general conditions under which x-equivalence w.r.t. a set X implies safe equivalence w.r.t. to the set 2X. Then, in the subsequent lemmas, we show that these conditions are satisfied by the set of bounded-treewidth Horn clauses.
Lemma 3
Let X be a set of clauses and let โ โ 2X be a hypothesis language. Let A and B be clauses. Let A โ X B w.r.t. the set X and let the following be true for any โ โ โ and any clause C: if โ โง C and C is not a tautology then there is a clause D โ X such that โ โง D and D โผ ๐ C. Then for any โ โ โ, it holds (โ โง A) โ (โ โง B).
Proof
First, we need to consider the case when A and B are both tautologies. If both A and B are tautologies then (โ โง A) โ (โ โง B) naturally holds for any โ. Now, we can consider the case when at most one of the clauses A and B is a tautology. Let us assume w.l.o.g. that if one of the clauses is a tautology then it is the clause B. If โ โง A then there is a clause D โ X such that โ โง D and D โผ ๐ A (by the assumptions of the proposition). Since D โ X, D โผ ๐ A and A โผ X B, we have D โผ ๐ B (from the definition of x-subsumption) and finally also โ โง D โง B and so โ โง B. The other implication can be shown in a completely similar fashion if A is not a tautology. โก
Lemma 4
Let X be a set of Horn clauses such that any clause which can be derived from a clausal theory โ โ 2X using SLD resolution is contained in X. If e and รช are two Horn clauses such that e โ X รช then for any โ โ 2X: (โ โง e) โ (โ โง รช).
Proof
We will use the subsumption theorem for Horn clauses and Lemma 3. We will show that the conditions of this lemma imply conditions of Lemma 3. If A is a non-tautological clause and โ โง A then by the subsumption theorem there must be a clause C derivable from โ using resolution (SLD resolution, respectively) such that C โผ ๐ A. Therefore for any non-tautological clause A, if โ โง A where โ โ 2Xthen there must be a clause C โ X such that โ โง C (because resolution is sound) and C โผ ๐ A. Now, since e โ X รช , we may finish the proof using Lemma 3 which gives us (โ โง e) โ (โ โง รช ) for any โ โ 2X. โก
Lemma 5
(Clique containment, lemma Bodlaender and Mohring 1993) Let A be a clause and T A be its tree decomposition. For any l โ A, there is a vertex in T A labelled by a set of variables v such that vars(l) โ V.
Proof
The proof can be found in Bodlaender and Mohring (1993). More precisely, the paper (Bodlaender and Mohring 1993) contains a lemma which states that if C is a clique in a graph G = (V, E) then any tree decomposition of G contains a vertex labelled by a set of vertices L such that C โ L. Our statement of this lemma in terms of clauses and tree decompositions of clauses then follows immediately from this result which can be shown by noticing that the decomposition of the clause A can be easily converted to a tree decomposition of Aโs Gaifman graph G A where, for any l โ A, var(l) corresponds to a clique in G A . โก
Lemma 6
Let A be a function-free clause and T A be its tree decomposition. Let l โ โ A be a literal and let ๐ be a substitution not affecting any variable in the set vars(A)โvars(l โ), mapping variables to variables or terms (i.e. not to function symbols) and never mapping any variables to elements of the set vars(A). Then a tree decomposition of A๐ can be obtained by applying the substitution ๐ on the variables contained in the labels of the tree decomposition T A and removing constants from these sets if necessary โ we denote the new labelled tree by T A ๐. As a consequence, the treewidth of A๐ is never greater than the treewidth of A.
Proof
If we apply the substitution ๐ on the labels of the tree decomposition T A then none of the label-sets associated to the vertices of T A increases in size (this is in part due to the fact that we do not consider function symbols). Therefore if we are able to show that T A ๐ is a tree decomposition of A๐ then we will automatically get also the result that the treewidth of A๐ is not greater than the treewidth of A. So, let us show that T A ๐ is indeed a tree decomposition of A๐.
-
(i)
Claim: For every variableV โ vars(A๐) there is a vertex of T A ๐ labelled by a set containingV . This is obvious.
-
(ii)
Claim: For every pair of variablesu, V which both appear in a literall โ A๐, there is a vertex of T A ๐ labelled by a set containing both U and V . There must be a literal lโฒ โ A containing two variables Uโฒand Vโฒsuch that Uโฒ๐ = U and Vโฒ๐ = V (because U and V are both contained in a literal). Therefore there must be a vertex T of T A labelled by a set S t containing both Uโฒand Vโฒ. After applying the substitution ๐ on the set S t , we get a set by which some vertex contained in T A ๐ is labelled and it contains both U and V .
-
(iii)
Claim: For everyV โ vars(A๐), the set of vertices of T A ๐ labelled by sets containingV forms a connected subgraph of T A ๐. Let us assume (for contradiction) that there is a variable V โ vars(A๐) such that the set of vertices of T A ๐ labelled by sets containing V forms a disconnected graph. It follows that there must be two variables Uโฒ, Vโฒ( Uโฒ โ Vโฒ) such that Uโฒ๐ = Vโฒ๐ = V and the sets of vertices S Uโฒ and S Vโฒ of the tree decomposition T A corresponding to the variables Uโฒand Vโฒ, respectively, must be disjoint (because the set of vertices with labels containing a given variable must form a connected subgraph in any tree decomposition). However, the variables Uโฒand Vโฒ must appear in the literal l โ because the substitution ๐ affects only variables contained in l โ and maps no variables of A to elements of the set vars(A) (since at least one of the variables must have been affected by the substitution and since it is equal to the other variable, the other variable must have been affected by the substitution as well). Thus, since both Uโฒ and Vโฒ must be contained in l โ there must be a vertex in the tree decomposition T A labelled by a set which contains both Uโฒ and Vโฒ. The sets of vertices of T A labelled by sets containing the variables Uโฒ and Vโฒ, respectively, therefore cannot be disjoint which is a contradiction.
We have thus shown that T A ๐ is a tree decomposition of A๐. โก
Lemma 7
Let A = l 1 โจ l 2 โจ โฏ โจ l m and B = m 1 โจ m 2 โจ โฏ โจ m n be two standardized-apart function-free clauses. Let ๐ be a most general unifier of the literals l i , ยฌ m j not affecting any variable in the set (vars(A) โช vars(B))โ(vars(l i ) โช vars(m j )) such that vars(l i ๐) โฉ vars(A) = vars(l i ๐) โฉ vars(B) = โ . Next, let
be a binary resolvent of A and B. Then for the treewidth k C of C, it holds k C โค max{k A , k B } where k A is the treewidth of A and k B is the treewidth of B.
Proof
Using Lemma 6, we get that A๐ has a tree decomposition T A๐ of width at most k A and B๐ has a tree decomposition T B๐ of width at most k B . We will now show how to construct a tree decomposition of width at most max{k A , k B } for the clause C. Let V A ( V B , respectively) be a vertex from T A๐ ( T B๐ , respectively) which is labelled by a set of variables ๐ฑ A ( ๐ฑ B , respectively) such that vars(l i ๐) โ ๐ฑ A ( vars(l i ๐) โ ๐ฑ B ) โ such vertices must exist by Lemma 5. We construct the new tree decomposition T C by connecting T A๐ and T B๐ by a new edge between the vertices V A and V B (we may remove the variables not contained in the clause C from the labels of the vertices of T C ). Clearly, T C has width at most max{k A , k B }. We need to show that it is indeed a tree decomposition of C. The first two conditions from Definition 7 are trivially satisfied which follows from the fact that T A๐ and T B๐ are tree decompositions of the two clauses A๐ and B๐ and from C โ A๐ โช B๐. It remains to show validity of the third condition (connectedness).Let us assume (for contradiction) that there is a variable V โ vars(C) such that the vertices labelled by sets containing the variable V form a disconnected subgraph of T C . The variable V cannot be contained in vars(A) or vars(B). If v โ vars(A) then V โ vars(B) (because A and B were standardized apart) and also V โ vars(l i ๐) (because we selected the unifier ๐ to satisfy vars(l i ๐) โฉ vars(A) = โ ) but then the set of vertices labelled by sets containing the variable V could not be disconnected because it is actually connected in T A๐ and none of the labels in T B๐ contains V . The same argument can be used to show that V โ vars(B). So the only remaining possibility is that V โ vars(l i ๐). However, this is not possible either. Since both T A๐ and T B๐ are tree decompositions, the set of vertices labelled by the sets containing the variable V forms a connected subgraph in both T A๐ and T B๐ . Moreover, a vertex from T A๐ and a vertex from T B๐ which are both labelled by sets containing all variables from vars(l๐) are connected by an edge in T C therefore the set of vertices labelled by the sets containing the variable V must form a connected subgraph of T C . Thus, we have arrived at a contradiction because there cannot be any variable with a disconnected subgraph of T C associated to it.
We have verified that T C is a tree decomposition of C with width at most max{k A , k B }. โก
Lemma 8
Let X k be the set of all function-free Horn clauses with treewidth at most k. Then for any clause C derivable by SLD resolution from a clausal theory \(\mathcal {H} \in 2^{X_{k}}\) it holds C โ X k .
Proof
Since this lemma considers only SLD resolution, we can consider just the case of binary resolvents (we do not need to take factors into account). The proposition then follows immediately from Lemma 7 because any clause derived by applying the binary resolution rule on two clauses must always have treewidth bounded by the treewidth of the clauses from which it was derived. โก
Proposition 14
Let X k be the set of all function-free Horn clauses with treewidth at most k and let \(\mathcal {L} = 2^{X_{k}}\) be the set of theories consisting of function-free Horn clauses with treewidth at most k. Then any two clauses which are x-equivalent w.r.t. X k are also safely equivalent w.r.t. โ.
Proof
Follows directly from Lemmas 4 and 8. โก
Rights and permissions
About this article
Cite this article
Kuลพelka, O., Szabรณovรก, A. & ลฝeleznรฝ, F. A method for reduction of examples in relational learning. J Intell Inf Syst 42, 255โ281 (2014). https://doi.org/10.1007/s10844-013-0294-z
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10844-013-0294-z