Bulletin of Mathematical Biology, Volume 80, Issue 3, pp 493–518

New Gromov-Inspired Metrics on Phylogenetic Tree Space

Original Article

Abstract

We present a new class of metrics for unrooted phylogenetic X-trees, inspired by the Gromov–Hausdorff distance for (compact) metric spaces. These metrics can be computed efficiently by linear or quadratic programming, and they are robust under nearest-neighbour interchange (NNI) operations. Their local behaviour shows that they differ from all previously introduced metrics. Their performance is briefly analysed on random weighted and unweighted trees as well as on random caterpillars.
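
The abstract does not spell out the construction, so the following is only a rough illustration of the underlying idea: treat each X-tree as a metric on its leaf set and compare the two metrics. It is not the class of metrics introduced in the article. The sketch assumes trees given as weighted edge lists, and the helper names `leaf_distances` and `label_distortion` are hypothetical. It computes the distortion of the label-preserving correspondence between two trees; half of that value is an upper bound on the classical Gromov–Hausdorff distance between the two leaf metrics.

```python
def leaf_distances(edges, leaves):
    """Path-length distances between all pairs of leaves of a weighted tree.

    edges: iterable of (u, v, weight) tuples (assumed input format).
    leaves: iterable of leaf labels, i.e. the label set X.
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    def from_source(src):
        # Depth-first traversal; in a tree every node is reached by one path.
        dist = {src: 0.0}
        stack = [src]
        while stack:
            node = stack.pop()
            for nxt, w in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + w
                    stack.append(nxt)
        return dist

    leaves = sorted(leaves)
    out = {}
    for a in leaves:
        d = from_source(a)
        for b in leaves:
            out[(a, b)] = d[b]
    return out


def label_distortion(d1, d2):
    """Largest absolute difference between two leaf metrics on the same labels.

    This is the distortion of the correspondence pairing each label with
    itself; it is an illustration, not the metric defined in the article.
    """
    return max(abs(d1[key] - d2[key]) for key in d1)


# Two quartet trees on X = {a, b, c, d} with unit edge weights.
# They differ by a single NNI move: ab|cd versus ac|bd.
X = ["a", "b", "c", "d"]
t1 = [("a", "u", 1), ("b", "u", 1), ("u", "v", 1), ("c", "v", 1), ("d", "v", 1)]
t2 = [("a", "u", 1), ("c", "u", 1), ("u", "v", 1), ("b", "v", 1), ("d", "v", 1)]

d1 = leaf_distances(t1, X)
d2 = leaf_distances(t2, X)
print(label_distortion(d1, d2) / 2)  # upper bound on the Gromov-Hausdorff distance
```

In this toy case a single NNI move changes each leaf-to-leaf distance by at most the length of the interior edge, which gives some intuition for why metrics built on leaf distances can be expected to behave stably under NNI operations.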

Keywords

Tree space, Phylogenetic distance, Caterpillars, Gromov–Hausdorff metric, Mathematical programming

Notes

Acknowledgements

First of all, I thank Mareike Fischer for introducing me to the world of phylogenetic distances; she also helped a great deal in arriving at a clear notation. Second, I am very grateful to Jürgen Eichhorn, who unwittingly drew my attention to metrics between metric spaces. Third, I would like to thank Michelle Kendall for her inspiring talk at the Portobello conference 2015 and for additional discussion later. Fourth, I thank Mike Steel for many interesting discussions, useful hints, his kind hospitality during my stay in Christchurch in 2010, and for organising the amazing 2015 workshop in Kaikoura, with its inspiring and open atmosphere. Further, Miroslav Bačak, Andrew Francis, Alexander Gavryushkin, Stefan Grünewald, Marc Hellmuth and Giulio dalla Riva gave useful hints and inspiration in many discussions. The questions and hints of five anonymous referees regarding previous versions of this manuscript helped to improve it substantially.

Copyright information

© Society for Mathematical Biology 2017

Authors and Affiliations

  1. Department of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
