A Polynomial Time Matching Algorithm of Ordered Tree Patterns Having Height-Constrained Variables

  • Kazuhide Aikou
  • Yusuke Suzuki
  • Takayoshi Shoudai
  • Tomoyuki Uchida
  • Tetsuhiro Miyahara
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3537)


Tree structured data such as HTML/XML files are represented by rooted trees with ordered children and edge labels. Knowledge representations for tree structured data are important for discovering the interesting features that such data have. In order to represent tree structured patterns with rich structural features, we introduce a new type of structured variable, called a height-constrained variable. An (i,j)-height-constrained variable can be replaced with any tree whose trunk length is at least i and whose height is at most j. We then define a term tree as a rooted tree structured pattern with ordered children and height-constrained variables. In this paper, given a term tree t and an ordered tree T, we present an \(O(N\max\{nD_{\max},{\cal S}\})\) time algorithm for deciding whether or not t matches T, where \(D_{\max}\) is the maximum number of children of an internal vertex in T, \({\cal S}\) is the sum of the trunk length constraints i over all (i,j)-height-constrained variables in t, and n and N are the numbers of vertices of t and T, respectively.
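The substitution condition for a single (i,j)-height-constrained variable can be illustrated with a minimal Python sketch. It is not the matching algorithm of the paper. The abstract does not define trunk length, so the sketch assumes it is the number of edges on the path from the root to one designated leaf. The names `Node`, `is_port`, `height`, `trunk_length` and `satisfies` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A vertex of an ordered tree; children are kept in left-to-right order."""
    children: List["Node"] = field(default_factory=list)
    is_port: bool = False  # hypothetical marker for the designated leaf


def height(root: Node) -> int:
    """Number of edges on a longest root-to-leaf path."""
    if not root.children:
        return 0
    return 1 + max(height(child) for child in root.children)


def trunk_length(root: Node) -> Optional[int]:
    """Edges from the root to the designated leaf (assumed definition).

    Returns None when the tree has no designated leaf.
    """
    if root.is_port and not root.children:
        return 0
    for child in root.children:
        depth = trunk_length(child)
        if depth is not None:
            return depth + 1
    return None


def satisfies(root: Node, i: int, j: int) -> bool:
    """True if the tree may replace an (i, j)-height-constrained variable."""
    trunk = trunk_length(root)
    return trunk is not None and trunk >= i and height(root) <= j


# A tree of height 3 whose designated leaf lies at depth 2.
example = Node(children=[
    Node(children=[Node(children=[Node()])]),
    Node(children=[Node(is_port=True)]),
])
print(satisfies(example, 2, 3))  # True: trunk length 2 >= 2, height 3 <= 3
print(satisfies(example, 2, 2))  # False: height 3 exceeds j = 2
```

The check visits each vertex a constant number of times, so it is linear in the size of the candidate tree. Deciding whether a whole term tree matches T requires combining such checks over all variables, which the sketch does not attempt.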





Copyright information

© Springer-Verlag Berlin Heidelberg 2005

Authors and Affiliations

  • Kazuhide Aikou (1)
  • Yusuke Suzuki (1, 2)
  • Takayoshi Shoudai (1)
  • Tomoyuki Uchida (2)
  • Tetsuhiro Miyahara (2)

  1. Department of Informatics, Kyushu University, Kasuga, Japan
  2. Faculty of Information Sciences, Hiroshima City University, Hiroshima, Japan
