# Polynomial Time Inductive Inference of Ordered Tree Patterns with Internal Structured Variables from Positive Data

## Abstract

Tree-structured data such as HTML/XML files are represented by rooted trees with ordered children and edge labels. To represent a tree-structured pattern in such data, we propose an ordered tree pattern, called a term tree, which is a rooted tree pattern consisting of ordered children and internal structured variables. A term tree generalizes the standard tree patterns that represent first-order terms in formal logic. For a set of edge labels $\Lambda$ and a term tree $t$, the term tree language of $t$, denoted by $L_{\Lambda}(t)$, is the set of all labeled trees obtained from $t$ by substituting arbitrary labeled trees for all variables in $t$. In this paper, we propose polynomial time algorithms for the following two problems for two fundamental classes of term trees. The membership problem is, given a term tree $t$ and a tree $T$, to decide whether or not $L_{\Lambda}(t)$ includes $T$. The minimal language problem is, given a set of labeled trees $S$, to find a term tree $t$ such that $L_{\Lambda}(t)$ is minimal among all term tree languages which contain all trees in $S$. Then, by using these two algorithms, we show that the two classes of term trees are polynomial time inductively inferable from positive data.
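
To make the membership problem concrete, the following is a naive brute-force sketch, not the polynomial time algorithm proposed in the paper. It rests on assumptions the abstract does not spell out: each variable is treated as an edge-like slot with a parent port and a child port, a substituted tree attaches its root to the parent port and one of its leaves to the child port, and every variable is independent of the others. The encoding (nested tuples of `(label, child)` pairs), the `VAR` marker, and the function names are hypothetical and chosen only for illustration.

```python
VAR = None  # hypothetical marker: an edge labeled VAR is a variable

# A node is a tuple of (label, child) pairs in left-to-right order.
# A leaf is the empty tuple ().

def descendants(node):
    """Yield `node` and every node below it."""
    yield node
    for _, child in node:
        yield from descendants(child)

def matches(pattern, tree):
    """Brute-force test of whether `tree` is obtainable from `pattern`
    by substituting a tree for every variable (assumed semantics)."""
    def fit(i, j):
        # Can pattern edges i.. be laid over tree edges j.. in order?
        if i == len(pattern):
            return j == len(tree)
        label, sub = pattern[i]
        if label is not VAR:
            # An ordinary edge must match exactly one tree edge.
            return (j < len(tree)
                    and tree[j][0] == label
                    and matches(sub, tree[j][1])
                    and fit(i + 1, j + 1))
        # A variable absorbs a non-empty block of consecutive tree edges;
        # its child port must land on some node below that block.
        for k in range(j + 1, len(tree) + 1):
            block = tree[j:k]
            if any(matches(sub, d)
                   for _, c in block
                   for d in descendants(c)) and fit(i + 1, k):
                return True
        return False
    return fit(0, 0)

# Pattern: root -a-> u, a variable below u, then an edge b below the variable.
pattern = (("a", ((VAR, (("b", ()),)),)),)
inside = (("a", (("c", (("b", ()),)),)),)   # variable replaced by a single edge c
outside = (("a", (("c", (("d", ()),)),)),)  # no edge b below the variable

print(matches(pattern, inside), matches(pattern, outside))
```

Under these assumptions `inside` belongs to the language of `pattern` and `outside` does not. The search tries every block of sibling edges and every descendant, so its cost can grow quickly with tree size; it only illustrates the definition of membership.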
