Fast Algorithms for Computing Tree LCS

  • Shay Mozes
  • Dekel Tsur
  • Oren Weimann
  • Michal Ziv-Ukelson
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5029)

Abstract

The LCS of two rooted, ordered, and labeled trees F and G is the largest forest that can be obtained from both trees by deleting nodes. We present algorithms for computing tree LCS which exploit the sparsity inherent to the tree LCS problem. Assuming G is smaller than F, our first algorithm runs in time \(O(r\cdot {\rm height}(F) \cdot {\rm height}(G)\cdot \lg\lg |G|)\), where r is the number of pairs (v ∈ F, w ∈ G) such that v and w have the same label. Our second algorithm runs in time \(O(L r \lg r \cdot \lg\lg|G|)\), where L is the size of the LCS of F and G. For this algorithm we present a novel three-dimensional alignment graph. Our third algorithm is intended for the constrained variant of the problem in which only nodes with zero or one children can be deleted. For this case we obtain an \(O(r h \lg \lg|G|)\) time algorithm, where h = height(F) + height(G).
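To make the parameters in these bounds concrete, the following minimal sketch computes r (the number of same-label node pairs) and the tree heights for two small example trees. It is not any of the paper's algorithms. The tuple encoding `(label, [children])`, the function names, and the convention that height counts nodes on the longest root-to-leaf path are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical encoding (not from the paper): a tree is (label, [children]),
# with children listed in left-to-right order.

def labels(tree):
    """Yield the label of every node of the tree in preorder."""
    label, children = tree
    yield label
    for child in children:
        yield from labels(child)

def height(tree):
    """Count the nodes on the longest root-to-leaf path (assumed convention)."""
    _, children = tree
    return 1 + max((height(c) for c in children), default=0)

def match_pair_count(f, g):
    """Return r: the number of pairs (v in f, w in g) with equal labels."""
    count_f = Counter(labels(f))
    count_g = Counter(labels(g))
    # Each label contributes (occurrences in f) * (occurrences in g) pairs.
    return sum(n * count_g[label] for label, n in count_f.items())

# Two small example trees, chosen only for illustration.
F = ("a", [("b", [("c", []), ("a", [])]), ("c", [])])
G = ("a", [("c", []), ("b", [])])

r = match_pair_count(F, G)
h = height(F) + height(G)
print(r, height(F), height(G), h)
```

Here F has 5 nodes and G has 3, so there are 15 node pairs in total, of which only the same-label pairs count toward r. The sparse algorithms are attractive when r is much smaller than |F|·|G|, for example when the label alphabet is large.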

Keywords

Match Pair, Main Path, Longest Common Subsequence, Path Decomposition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Shay Mozes (1)
  • Dekel Tsur (2)
  • Oren Weimann (3)
  • Michal Ziv-Ukelson (2)
  1. Brown University, Providence, USA
  2. Ben-Gurion University, Beer-Sheva, Israel
  3. Massachusetts Institute of Technology, Cambridge, USA
