Phylogenies without Branch Bounds: Contracting the Short, Pruning the Deep

Extended Abstract
  • Constantinos Daskalakis
  • Elchanan Mossel
  • Sebastien Roch
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5541)


We introduce a new phylogenetic reconstruction algorithm that, unlike most previous rigorous inference techniques, does not rely on assumptions about the branch lengths or the depth of the tree. The algorithm returns a forest that is guaranteed to contain all edges that are (1) sufficiently long and (2) sufficiently close to the leaves. How much of the true tree is recovered depends on the length of the sequences provided. The algorithm is distance-based and runs in polynomial time.
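
To make "distance-based" concrete, here is a minimal sketch of the kind of input such a method consumes: pairwise distances estimated from aligned sequences. It is not the authors' algorithm. It assumes a two-state symmetric (CFN) substitution model, which the abstract does not specify, and the function names and the `max_reliable` cutoff are hypothetical choices made for illustration. Discarding estimates above the cutoff only hints at why deep parts of a tree are hard to recover from short sequences.

```python
import math
from typing import Dict, Optional, Tuple


def hamming_fraction(a: str, b: str) -> float:
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def cfn_distance(p: float) -> float:
    """Distance estimate under the assumed two-state symmetric (CFN) model.

    Uses d = -1/2 * ln(1 - 2p); the estimate is infinite once p >= 1/2.
    """
    if p >= 0.5:
        return math.inf
    return -0.5 * math.log(1.0 - 2.0 * p)


def estimated_distances(
    seqs: Dict[str, str], max_reliable: float
) -> Dict[Tuple[str, str], Optional[float]]:
    """Estimate all pairwise distances between the named sequences.

    Estimates above the hypothetical cutoff `max_reliable` are stored as
    None, marking them as too noisy to use at this sequence length.
    """
    names = sorted(seqs)
    result: Dict[Tuple[str, str], Optional[float]] = {}
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            d = cfn_distance(hamming_fraction(seqs[u], seqs[v]))
            result[(u, v)] = d if d <= max_reliable else None
    return result


if __name__ == "__main__":
    toy = {"a": "00000000", "b": "00000001", "c": "01010101"}
    for pair, dist in estimated_distances(toy, max_reliable=0.5).items():
        print(pair, dist)
```

In this toy input only the pair (a, b) falls under the cutoff; the other two pairs are marked unusable. With longer sequences a larger cutoff could be justified, which loosely mirrors the abstract's statement that the amount of tree recovered depends on the sequence length.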


Keywords: Polynomial Time; Branch Length; Internal Vertex; Tree Depth; Short Edge





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Constantinos Daskalakis (1)
  • Elchanan Mossel (2)
  • Sebastien Roch (1)

  1. Microsoft Research, USA
  2. UC Berkeley and Weizmann Institute, USA
