The noisy subsequence tree recognition problem

  • B. J. Oommen
  • R. K. S. Loke
Structural Matching and Grammatical Inference
Part of the Lecture Notes in Computer Science book series (LNCS, volume 1451)

Abstract

In this paper we consider the problem of recognizing ordered labeled trees by processing their noisy subsequence-trees, which are “patched-up” noisy portions of their fragments. We assume that we are given H, a finite dictionary of ordered labeled trees. X* is an unknown element of H, and U is an arbitrary subsequence-tree of X*. We consider the problem of estimating X* by processing Y, a noisy version of U. We do this by sequentially comparing Y with every element X of H, the basis of comparison being the constrained edit distance between two trees [OL94], where the constraint implicitly captures the properties of the corrupting mechanism (“channel”) which noisily garbles U into Y. Experimental results on manually constructed trees of between 25 and 35 nodes, containing an average of 21.8 errors per tree, demonstrate that the scheme has about 92.8% accuracy. Similar experiments on randomly generated trees yielded an accuracy of 86.4%. To our knowledge this is the first reported solution to the problem.
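
The recognition rule described in the abstract (compare Y with every X in H and keep the closest one) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `Tree` class, the `recognize` function, and in particular `standin_distance`, which is a plain unit-cost Levenshtein distance on preorder label sequences. It is only a placeholder and is not the constrained tree edit distance of [OL94], which additionally encodes the properties of the channel.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Tree:
    """Ordered labeled tree: a label plus an ordered list of children."""
    label: str
    children: List["Tree"] = field(default_factory=list)


def preorder(t: Tree) -> List[str]:
    """Return the labels of t in left-to-right preorder."""
    out = [t.label]
    for child in t.children:
        out.extend(preorder(child))
    return out


def standin_distance(x: Tree, y: Tree) -> int:
    """Unit-cost Levenshtein distance between preorder label sequences.

    Placeholder only (an assumption of this sketch); it is NOT the
    constrained tree edit distance of [OL94].
    """
    a, b = preorder(x), preorder(y)
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, la in enumerate(a, 1):
        cur = [i]
        for j, lb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete la
                           cur[j - 1] + 1,             # insert lb
                           prev[j - 1] + (la != lb)))  # substitute or match
        prev = cur
    return prev[-1]


def recognize(
    y: Tree,
    dictionary: Dict[str, Tree],
    dist: Callable[[Tree, Tree], int] = standin_distance,
) -> Tuple[Optional[str], Optional[int]]:
    """Sequentially compare y with every dictionary tree; keep the closest."""
    best_name, best_d = None, None
    for name, x in dictionary.items():
        d = dist(x, y)
        if best_d is None or d < best_d:
            best_name, best_d = name, d
    return best_name, best_d
```

A small usage example, with hypothetical trees: if H holds X1 = a(b(d, e), c) and X2 = a(f, g(h)), and Y = a(b(d), z) is obtained from X1 by dropping node e and corrupting label c into z, then `recognize(Y, H)` selects X1. Passing a different `dist` callable is the intended way to substitute a genuine tree distance for the placeholder.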

Keywords

Edit Operation, Label Tree, Noisy Version, Unordered Tree, Deletion Error
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. [Lu79]
    S. Y. Lu, “A tree-to-tree distance and its application to cluster analysis”, IEEE Trans. Pattern Anal. and Mach. Intell., Vol. PAMI 1, No. 2: pp. 219–224 (1979).
  2. [Oo87]
    B. J. Oommen, “Recognition of noisy subsequences using constrained edit distances”, IEEE Trans. Pattern Anal. and Mach. Intell., Vol. PAMI 9, No. 5: pp. 676–685 (1987).
  3. [OK96]
    B. J. Oommen and R. L. Kashyap, “A formal theory for optimal and information theoretic syntactic pattern recognition”. (To appear in Pattern Recognition).
  4. [OL94]
    B. J. Oommen and W. Lee, “Constrained Tree Editing”, Information Sciences, Vol. 77, No. 3–4: pp. 253–273 (1994).
  5. [OL97]
    B. J. Oommen and W. Lee, “Constrained Tree Editing”, Information Sciences, Vol. 77, No. 3–4: pp. 253–273 (1994).
  6. [OZL98]
    B. J. Oommen and R. K. S. Loke, “On the Recognition of Noisy Subsequence Trees”. Unabridged version of this paper.
  7. [SK83]
    D. Sankoff and J. B. Kruskal, Time warps, string edits, and macromolecules: The theory and practice of sequence comparison, Addison-Wesley (1983).
  8. [Se77]
    S. M. Selkow, “The tree-to-tree editing problem”, Inform. Proc. Let., Vol. 6: pp. 184–186 (1977).
  9. [SZ90]
    B. Shapiro and K. Zhang, “Comparing multiple RNA secondary structures using tree comparisons”, Comput. Appl. Biosci., Vol. 6, No. 4: pp. 309–318 (1990).
  10. [Ta79]
    K. C. Tai, “The tree-to-tree correction problem”, J. Assoc. Comput. Mach., Vol. 26: pp. 422–433 (1979).
  11. [ZJ94]
    K. Zhang and T. Jiang, “Some MAX SNP-hard results concerning unordered labeled trees”, Information Processing Letters, Vol. 49: pp. 249–254 (1994).
  12. [ZS89]
    K. Zhang and D. Shasha, “Simple fast algorithms for the editing distance between trees and related problems”, SIAM J. Comput., Vol. 18, No. 6: pp. 1245–1262 (1989).
  13. [ZSW92]
    K. Zhang, D. Shasha and J. T. L. Wang, “Fast serial and parallel approximate tree matching with VLDC's”, Proc. of the 1992 Symposium on Combinatorial Pattern Matching, CPM92: pp. 148–161 (1992).

Copyright information

© Springer-Verlag Berlin Heidelberg 1998

Authors and Affiliations

  • B. J. Oommen (1)
  • R. K. S. Loke (1)
  1. School of Computer Science, Carleton University, Ottawa, Canada
