GTP Supertrees from Unrooted Gene Trees: Linear Time Algorithms for NNI Based Local Searches

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 7292)

Abstract

Gene tree parsimony (GTP) problems infer species supertrees from a collection of rooted gene trees that are confounded by evolutionary events such as gene duplication, gene duplication and loss, and deep coalescence. These problems are NP-complete and are therefore often addressed by effective local search heuristics that perform a stepwise search of the tree space, where each step is guided by an exact solution to an instance of a local search problem. However, GTP problems require rooted input gene trees, whereas in practice most phylogenetic methods infer unrooted gene trees, which may be difficult to root correctly. In this work, we (i) define the first local NNI search problems for heuristically solving the GTP equivalents for unrooted input gene trees, called unrooted GTP problems, and (ii) describe linear time algorithms for these local search problems. We implemented the first NNI based local search heuristics for unrooted GTP problems, which enable analyses of thousands of genes. Further, an analysis of a large plant data set using the unrooted NNI search provides support for an intriguing new hypothesis regarding the evolutionary relationships among major groups of flowering plants.
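To make the idea of a stepwise NNI search concrete, the following is a minimal sketch and not the paper's algorithm. It assumes rooted gene trees and the gene duplication cost only, represents trees as nested tuples with species names at the leaves, and recomputes costs naively rather than in linear time. All function names are hypothetical and chosen for illustration.

```python
def leafset(t):
    """Set of species labels below node t (a leaf is a plain string)."""
    return frozenset([t]) if isinstance(t, str) else leafset(t[0]) | leafset(t[1])

def species_clusters(s):
    """Leaf sets of all subtrees of the rooted species tree s."""
    if isinstance(s, str):
        return [frozenset([s])]
    return [leafset(s)] + species_clusters(s[0]) + species_clusters(s[1])

def lca_cluster(gene_leaves, clusters):
    """LCA mapping: the smallest species cluster containing gene_leaves."""
    return min((c for c in clusters if gene_leaves <= c), key=len)

def duplication_cost(g, clusters):
    """Count gene tree nodes that map to the same species node as a child."""
    if isinstance(g, str):
        return 0
    m = lca_cluster(leafset(g), clusters)
    is_dup = any(lca_cluster(leafset(c), clusters) == m for c in g)
    return int(is_dup) + sum(duplication_cost(c, clusters) for c in g)

def total_cost(s, gene_trees):
    clusters = species_clusters(s)
    return sum(duplication_cost(g, clusters) for g in gene_trees)

def nni_neighbors(t):
    """Yield every rooted tree one nearest neighbor interchange away from t."""
    if isinstance(t, str):
        return
    left, right = t
    if not isinstance(left, str):   # swap a grandchild on the left with `right`
        a, b = left
        yield ((a, right), b)
        yield ((b, right), a)
    if not isinstance(right, str):  # swap a grandchild on the right with `left`
        c, d = right
        yield (c, (left, d))
        yield (d, (left, c))
    for n in nni_neighbors(left):   # interchanges deeper in the tree
        yield (n, right)
    for n in nni_neighbors(right):
        yield (left, n)

def nni_local_search(s, gene_trees):
    """Move to the best NNI neighbor until no neighbor lowers the cost."""
    best, best_cost = s, total_cost(s, gene_trees)
    while True:
        scored = [(total_cost(n, gene_trees), n) for n in nni_neighbors(best)]
        if not scored:
            return best, best_cost
        cost, cand = min(scored, key=lambda pair: pair[0])
        if cost >= best_cost:
            return best, best_cost
        best, best_cost = cand, cost

genes = [(("a", "b"), "c"), (("a", "b"), "c")]
tree, cost = nni_local_search((("a", "c"), "b"), genes)
```

In this sketch every step scores the whole neighborhood from scratch; the contribution described in the abstract is precisely to solve each such local search problem exactly and in linear time, and for unrooted rather than rooted gene trees.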

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Górecki, P., Burleigh, J.G., Eulenstein, O. (2012). GTP Supertrees from Unrooted Gene Trees: Linear Time Algorithms for NNI Based Local Searches. In: Bleris, L., Măndoiu, I., Schwartz, R., Wang, J. (eds) Bioinformatics Research and Applications. ISBRA 2012. Lecture Notes in Computer Science, vol 7292. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30191-9_11

  • DOI: https://doi.org/10.1007/978-3-642-30191-9_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30190-2

  • Online ISBN: 978-3-642-30191-9

  • eBook Packages: Computer Science, Computer Science (R0)
