Abstract
Reconciliation is the commonly used method for inferring the evolutionary scenario of a gene family. It consists of “embedding” an inferred gene tree into a known species tree, revealing the evolution of the gene family through duplications and losses. The main criticism of reconciliation is that the inferred evolutionary scenario depends strongly on the gene tree considered, as a few misplaced leaves may lead to a completely different history, with significantly more duplications and losses. As different phylogenetic methods with different parameters may yield different gene trees, it is essential to have criteria for choosing, among those, the appropriate one for reconciliation. In this paper, following the conclusion of a previous paper, we flag certain duplication vertices of a gene tree, the “non-apparent duplication” (NAD) vertices, as resulting from the misplacement of leaves, and consider the optimization problem of removing the minimum number of leaves so as to obtain a tree without any NAD vertex. We develop a polynomial-time algorithm that is exact for two special classes of gene trees, and show that it performs well on simulated data sets in the general case.
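To make the notion of a NAD vertex concrete, the following minimal Python sketch labels the internal vertices of a small gene tree. It is not the algorithm of the paper. It assumes the usual definitions from the reconciliation literature, which the abstract does not spell out: a vertex is a duplication when the LCA mapping sends it to the same species-tree vertex as one of its children, and such a duplication is "non-apparent" when its two subtrees share no species. The tuple encoding of trees and all function names are hypothetical choices made for illustration.

```python
# Illustrative sketch only; names and tree encoding are assumptions.
# A tree is a nested 2-tuple; a leaf is a string naming a species.
# Each gene-tree leaf is labelled by the species the gene belongs to.

def species_set(tree):
    """Set of species labels found at the leaves of a (sub)tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    left, right = tree
    return species_set(left) | species_set(right)

def clades(species_tree):
    """Leaf sets of all vertices of the species tree."""
    found = []
    def walk(t):
        found.append(species_set(t))
        if not isinstance(t, str):
            for child in t:
                walk(child)
    walk(species_tree)
    return found

def lca_map(gene_subtree, species_clades):
    """LCA mapping: smallest species clade containing the subtree's species."""
    s = species_set(gene_subtree)
    return min((c for c in species_clades if s <= c), key=len)

def classify(gene_tree, species_tree):
    """Label each internal gene-tree vertex, in post-order."""
    cl = clades(species_tree)
    labels = []
    def walk(t):
        if isinstance(t, str):
            return
        left, right = t
        walk(left)
        walk(right)
        if lca_map(t, cl) in (lca_map(left, cl), lca_map(right, cl)):
            # Duplication; apparent only if both subtrees share a species.
            shared = species_set(left) & species_set(right)
            labels.append((t, "apparent duplication" if shared else "NAD"))
        else:
            labels.append((t, "speciation"))
    walk(gene_tree)
    return labels

if __name__ == "__main__":
    S = (("a", "b"), "c")   # species tree
    G = (("a", "c"), "b")   # gene tree with leaf "b" misplaced
    for vertex, label in classify(G, S):
        print(vertex, label)
```

In this toy case the root of the gene tree is flagged as a NAD vertex, and removing the single leaf "b" leaves the tree ("a", "c"), which contains only a speciation vertex. This is the kind of leaf removal the optimization problem above seeks to minimize.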
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Doroftei, A., El-Mabrouk, N. (2011). Removing Noise from Gene Trees. In: Przytycka, T.M., Sagot, MF. (eds) Algorithms in Bioinformatics. WABI 2011. Lecture Notes in Computer Science, vol 6833. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23038-7_8
DOI: https://doi.org/10.1007/978-3-642-23038-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23037-0
Online ISBN: 978-3-642-23038-7
eBook Packages: Computer Science, Computer Science (R0)