Top-Down Tree Edit-Distance of Regular Tree Languages

  • Sang-Ki Ko
  • Yo-Sub Han
  • Kai Salomaa
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8370)

Abstract

We study the edit-distance of regular tree languages. The edit-distance is a metric for measuring the similarity or dissimilarity between two objects, and a regular tree language is a set of trees accepted by a finite-state tree automaton or described by a regular tree grammar. Given two regular tree languages L and R, we define the edit-distance d(L,R) between L and R to be the minimum edit-distance between a tree t₁ ∈ L and a tree t₂ ∈ R. Based on tree automata for L and R, we present a polynomial-time algorithm that computes d(L,R). We also suggest how to use the edit-distance between two tree languages for identifying a special common string between two context-free grammars.
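
To make the definition of d(L,R) concrete, the following is a minimal brute-force sketch for finite sets of trees. It is not the automata-based algorithm of the paper, which handles possibly infinite regular tree languages. The tree encoding, the function names, and the unit-cost, Selkow-style top-down model (relabel a node, or insert or delete a whole subtree below a matched parent) are assumptions made only for this illustration.

```python
from functools import lru_cache

# Assumed encoding: a tree is (label, (child, child, ...)).
# Tuples are used so that trees are hashable and can be memoised.

def size(t):
    """Number of nodes in tree t."""
    return 1 + sum(size(c) for c in t[1])

@lru_cache(maxsize=None)
def top_down_distance(a, b):
    """Unit-cost top-down edit distance between two ordered trees.

    The roots are matched (cost 1 if their labels differ), and the two
    child sequences are aligned like strings. Deleting or inserting a
    child costs the size of its whole subtree.
    """
    relabel = 0 if a[0] == b[0] else 1
    xs, ys = a[1], b[1]
    # dp[i][j] = cheapest alignment of xs[:i] with ys[:j]
    dp = [[0] * (len(ys) + 1) for _ in range(len(xs) + 1)]
    for i in range(1, len(xs) + 1):
        dp[i][0] = dp[i - 1][0] + size(xs[i - 1])
    for j in range(1, len(ys) + 1):
        dp[0][j] = dp[0][j - 1] + size(ys[j - 1])
    for i in range(1, len(xs) + 1):
        for j in range(1, len(ys) + 1):
            dp[i][j] = min(
                dp[i - 1][j] + size(xs[i - 1]),   # delete subtree xs[i-1]
                dp[i][j - 1] + size(ys[j - 1]),   # insert subtree ys[j-1]
                dp[i - 1][j - 1] + top_down_distance(xs[i - 1], ys[j - 1]),
            )
    return relabel + dp[len(xs)][len(ys)]

def language_distance(L, R):
    """d(L, R) for FINITE collections of trees: minimum over all pairs."""
    return min(top_down_distance(t1, t2) for t1 in L for t2 in R)

# Hypothetical example trees.
t = ("a", (("b", ()), ("c", ())))
u = ("a", (("b", ()),))
v = ("x", (("b", ()), ("c", ())))
```

With these example trees, `language_distance([t], [u, v])` takes the cheaper of the two pairings, and the distance is 0 exactly when the two sets share a tree. Enumerating pairs is only feasible for small finite sets, which is why an approach working directly on tree automata is needed in general.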

Keywords

tree edit-distance, regular tree languages, tree automata, dynamic programming

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sang-Ki Ko (1)
  • Yo-Sub Han (1)
  • Kai Salomaa (2)
  1. Department of Computer Science, Yonsei University, Seoul, Republic of Korea
  2. School of Computing, Queen’s University, Kingston, Canada
