Handbook of Combinatorial Optimization, pp. 781–822

# Computing Distances between Evolutionary Trees

## Abstract

Comparing objects to find their *similarities* or, equivalently, their *dissimilarities* is a fundamental problem in many fields, including pattern recognition, image analysis, drug design, the study of thermodynamic costs of computing, and cognitive science. Various models have been introduced in the literature to measure the degree of similarity or dissimilarity; in the latter case, the degree of dissimilarity is often referred to as the *distance*. Some distances are straightforward to compute, *e.g.*, the Hamming distance for binary strings and the Euclidean distance for geometric objects. Others are formulated as combinatorial optimization problems and thus pose nontrivial algorithmic challenges; some are even uncomputable, such as the universal information distance between two objects [4].

## Keywords

Leaf Node, Binary Tree, Evolutionary Tree, Internal Edge, Weighted Tree

## References

- [1] D. Aldous, Triangulating the circle, at random, *Amer. Math. Monthly*, 89, pp. 223–234, 1994.
- [2] M.A. Armstrong, *Groups and Symmetry*, Springer Verlag, New York Inc., 1988.
- [3] D. Barry and J.A. Hartigan, Statistical analysis of hominoid molecular evolution, *Stat. Sci.*, 2, pp. 191–210, 1987.
- [4] C.H. Bennett, P. Gács, M. Li, P. Vitányi, and W. Zurek, Information Distance, to appear in *IEEE Trans. Inform. Theory*.
- [5] R. P. Boland, E. K. Brown and W. H. E. Day, Approximating minimum-length-sequence metrics: a cautionary note, *Math. Soc. Sci.*, 4, pp. 261–270, 1983.
- [6] K. Culik II and D. Wood, A note on some tree similarity measures, *Inform. Proc. Let.*, 15, pp. 39–42, 1982.
- [7] B. DasGupta, X. He, T. Jiang, M. Li, J. Tromp and L. Zhang, On distances between phylogenetic trees, *Proc. 8th Annual ACM-SIAM Symposium on Discrete Algorithms*, pp. 427–436, 1997.
- [8] B. DasGupta, X. He, T. Jiang, M. Li, and J. Tromp, On the linear-cost subtree-transfer distance, *Algorithmica*, submitted, 1997.
- [9] B. DasGupta, X. He, T. Jiang, M. Li, J. Tromp, and L. Zhang, On computing the nearest neighbor interchange distance, Preprint, 1997.
- [10] W. H. E. Day, Properties of the nearest neighbor interchange metric for trees of small size, *Journal of Theoretical Biology*, 101, pp. 275–288, 1983.
- [11] A. K. Dewdney, Wagner’s theorem for torus graphs, *Discrete Math.*, 4, pp. 139–149, 1973.
- [12] A.W.F. Edwards and L.L. Cavalli-Sforza, The reconstruction of evolution, *Ann. Hum. Genet.*, 27, 105, 1964. (Also in *Heredity*, 18, 553.)
- [13] J. Felsenstein, Evolutionary trees for DNA sequences: a maximum likelihood approach, *J. Mol. Evol.*, 17, pp. 368–376, 1981.
- [14] J. Felsenstein, personal communication, 1996.
- [15] W.M. Fitch, Toward defining the course of evolution: minimum change for a specified tree topology, *Syst. Zool.*, 20, pp. 406–416, 1971.
- [16] W.M. Fitch and E. Margoliash, Construction of phylogenetic trees, *Science*, 155, pp. 279–284, 1967.
- [17] M. R. Garey and D. S. Johnson, *Computers and Intractability: A Guide to the Theory of NP-Completeness*, W. H. Freeman, 1979.
- [18] L. Guibas and J. Hershberger, Morphing simple polygons, *Proceeding of the ACM 10th Annual Sym. of Comput. Geometry*, pp. 267–276, 1994.
- [19] J. Hein, Reconstructing evolution of sequences subject to recombination using parsimony, *Math. Biosci.*, 98, pp. 185–200, 1990.
- [20] J. Hein, A heuristic method to reconstruct the history of sequences subject to recombination, *J. Mol. Evol.*, 36, pp. 396–405, 1993.
- [21] J. Hein, personal email communication, 1996.
- [22] J. Hein, T. Jiang, L. Wang, and K. Zhang, On the complexity of comparing evolutionary trees, *Discrete Applied Mathematics*, 71, pp. 153–169, 1996.
- [23] J. Hershberger and S. Suri, Morphing binary trees, *Proceeding of the ACM-SIAM 6th Annual Symposium of Discrete Algorithms*, pp. 396–404, 1995.
- [24] F. Hurtado, M. Noy, and J. Urrutia, Flipping edges in triangulations, *Proc. of the ACM 12th Annual Sym. of Comput. Geometry*, pp. 214–223, 1996.
- [25] J. P. Jarvis, J. K. Luedeman and D. R. Shier, Counterexamples in measuring the distance between binary trees, *Mathematical Social Sciences*, 4, pp. 271–274, 1983.
- [26] J. P. Jarvis, J. K. Luedeman and D. R. Shier, Comments on computing the similarity of binary trees, *Journal of Theoretical Biology*, 100, pp. 427–433, 1983.
- [27] J. Kececioglu and D. Gusfield, Reconstructing a history of recombinations from a set of sequences, *Proc. 5th Annual ACM-SIAM Symp. Discrete Algorithms*, 1994.
- [28] M. Kuhner and J. Felsenstein, A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates, *Mol. Biol. Evol.*, 11 (3), pp. 459–468, 1994.
- [29] M. Křivánek, Computing the nearest neighbor interchange metric for unlabeled binary trees is NP-complete, *Journal of Classification*, 3, pp. 55–60, 1986.
- [30] V. King and T. Warnow, On Measuring the nni distance between two evolutionary trees, *DIMACS mini workshop on combinatorial structures in molecular biology*, Rutgers University, Nov 4, 1994.
- [31] S. Khuller, Open Problems: 10, *SIGACT News*, 24 (4), p. 46, Dec 1994.
- [32] W.J. Le Quesne, The uniquely evolved character concept and its cladistic application, *Syst. Zool.*, 23, pp. 513–517, 1974.
- [33] M. Li, J. Tromp, and L.X. Zhang, On the nearest neighbor interchange distance between evolutionary trees, *Journal of Theoretical Biology*, 182, pp. 463–467, 1996.
- [34] M. Li and L. Zhang, Better Approximation of Diagonal-Flip Transformation and Rotation Transformation, Manuscript, 1997.
- [35] G. W. Moore, M. Goodman and J. Barnabas, An iterative approach from the standpoint of the additive hypothesis to the dendrogram problem posed by molecular data sets, *Journal of Theoretical Biology*, 38, pp. 423–457, 1973.
- [36] J. Pallo, On rotation distance in the lattice of binary trees, *Infor. Proc. Letters*, 25, pp. 369–373, 1987.
- [37] D. F. Robinson, Comparison of labeled trees with valency three, *Journal of Combinatorial Theory, Series B*, 11, pp. 105–119, 1971.
- [38] N. Saitou and M. Nei, The neighbor-joining method: a new method for reconstructing phylogenetic trees, *Mol. Biol. Evol.*, 4, pp. 406–425, 1987.
- [39] D. Sankoff, Minimal mutation trees of sequences, *SIAM J. Appl. Math.*, 28, pp. 35–42, 1975.
- [40] D. Sankoff and J. Kruskal (Eds), *Time Warps, String Edits, and Macromolecules: the Theory and Practice of Sequence Comparison*, Addison Wesley, Reading Mass., 1983.
- [41] D. Sleator, R. Tarjan, W. Thurston, Rotation distance, triangulations, and hyperbolic geometry, *J. Amer. Math. Soc.*, 1, pp. 647–681, 1988.
- [42] D. Sleator, R. Tarjan, W. Thurston, Short encodings of evolving structures, *SIAM J. Discr. Math.*, 5, pp. 428–450, 1992.
- [43] K.C. Tai, The tree-to-tree correction problem, *J. ACM*, 26, pp. 422–433, 1979.
- [44] A. von Haeseler and G.A. Churchill, Network models for sequence evolution, *J. Mol. Evol.*, 37, pp. 77–85, 1993.
- [45] K. Wagner, Bemerkungen zum Vierfarbenproblem, *J. Deutschen Math.-Verein.*, 46, pp. 26–32, 1936.
- [46] M. S. Waterman, *Introduction to computational biology: maps, sequences and genomes*, Chapman & Hall, 1995.
- [47] M. S. Waterman and T. F. Smith, On the similarity of dendrograms, *Journal of Theoretical Biology*, 73, pp. 789–800, 1978.
- [48] K. Zhang and D. Shasha, Simple fast algorithms for the editing distance between trees and related problems, *SIAM J. Comput.*, 18, pp. 1245–1262, 1989.
- [49] K. Zhang, J. Wang and D. Shasha, On the editing distance between undirected acyclic graphs, *International J. of Foundations of Computer Science*, 7 (13), March 1996.