New Perspectives on Gene Family Evolution: Losses in Reconciliation and a Link with Supertrees
Reconciliation between a set of gene trees and a known species tree is the most commonly used approach to infer the duplication and loss events in the evolution of gene families. When the species tree is not known, a natural algorithmic problem is to infer a species tree such that the corresponding reconciliation minimizes the number of duplications and/or losses. In this paper, we clarify several theoretical questions and study various algorithmic issues related to these two problems. (1) For a given gene tree T and species tree S, we show that there is a single history explaining T and consistent with S that minimizes gene losses, and that this history also minimizes the number of duplications. We describe a simple algorithm, running in linear time and space, that computes this parsimonious history and is not based on the Lowest Common Ancestor (LCA) mapping approach. (2) We show that the problem of computing a species tree that minimizes the number of gene duplications, given a set of gene trees, is in fact a slight variant of a supertree problem. (3) We show that deciding whether a set of gene trees can be explained using only apparent duplications can be done efficiently, as can computing a parsimonious species tree for such gene trees. We also characterize gene trees that can be explained using only apparent duplications in terms of compatible triplets of leaves.
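For readers unfamiliar with the LCA mapping approach that the abstract contrasts itself with, the following is a minimal sketch of that classical baseline, not of the algorithm introduced in this paper. It assumes binary trees written as nested Python tuples whose leaves are species names, with gene-tree leaves labeled by the species they were sampled from. The function names (`clusters`, `lca_map`, `count_duplications`) are hypothetical, and the implementation favors clarity over the linear running time mentioned above.

```python
def clusters(tree):
    """Return the leaf set (cluster) of every node of a species tree.

    The last element of the returned list is the cluster of the root.
    """
    if isinstance(tree, str):
        return [frozenset([tree])]
    left, right = tree
    cl, cr = clusters(left), clusters(right)
    return cl + cr + [cl[-1] | cr[-1]]


def lca_map(species_set, species_clusters):
    """Smallest species-tree cluster containing species_set.

    Clusters containing a given set are nested, so the smallest one
    is the cluster of the lowest common ancestor.
    """
    return min((c for c in species_clusters if species_set <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count duplications implied by the classical LCA mapping.

    An internal gene node is a duplication when it maps to the same
    species-tree node as at least one of its children.
    """
    species_clusters = clusters(species_tree)
    dups = 0

    def visit(node):
        nonlocal dups
        if isinstance(node, str):
            return lca_map(frozenset([node]), species_clusters)
        left, right = node
        m_left, m_right = visit(left), visit(right)
        m_node = lca_map(m_left | m_right, species_clusters)
        if m_node in (m_left, m_right):
            dups += 1
        return m_node

    visit(gene_tree)
    return dups


species = (("a", "b"), "c")
congruent = count_duplications((("a", "b"), "c"), species)
conflicting = count_duplications((("a", "c"), ("b", "c")), species)
```

In this toy example the gene tree that matches the species tree needs no duplication, while the conflicting gene tree needs one at its root, because the root and its left child both map to the root of the species tree.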